KR20230103671A

KR20230103671A - Method and apparatus for providing a metaverse virtual reality companion animal capable of forming a relationship based on a user's emotions in a communication system

Info

Publication number: KR20230103671A
Application number: KR1020210194706A
Authority: KR
Inventors: 임세라
Original assignee: 주식회사 마블러스
Priority date: 2021-12-31
Filing date: 2021-12-31
Publication date: 2023-07-07

Abstract

The present invention relates to a method and apparatus for providing a metaverse virtual reality companion animal capable of forming a relationship based on a user's emotions in a communication system. Specifically, the invention generates a companion animal avatar in the virtual reality of the metaverse, analyzes the user's reaction emotion information from an image of the user's face as the user reacts to the screen showing the avatar's motion, and has the avatar perform an interactive response based on that information. In addition, the user's feedback on specific behavior patterns of the avatar, derived from the user's reaction emotion information, is reflected to modify the avatar's behavior patterns and personality and to perform reinforcement learning on them. The result is a method and apparatus for providing a companion animal avatar that can emotionally bond and form a relationship with the user.

Description

Method and apparatus for providing a metaverse virtual reality companion animal capable of forming a relationship based on a user's emotion in a communication system

As single-person households have increased in recent years, so has demand for companion animals that can live with people like family members. What people generally expect from a companion animal is an emotional connection with it as a being that has feelings. However, although people want to live with a companion animal and share an emotional connection with it, many find it difficult to keep one for practical reasons such as the burden of care, time, cost, and space.

Virtual reality technology related to the metaverse has recently drawn attention. "Metaverse" is a compound of "meta", meaning fictional or abstract, and "universe", meaning the real world, and the term refers to a virtual world. The metaverse is gradually blurring the boundary between the real world and the virtual world, because many actions performed in the real world can be carried out identically or similarly in the virtual world.

If a companion animal avatar is generated in the virtual reality of the metaverse, the user's reaction emotion information is analyzed from an image of the user's face reacting to the screen showing the avatar's motion, and the avatar then performs an interactive response based on that information, the user can experience an emotional connection with a companion animal in metaverse virtual reality. Furthermore, if the user's feedback on specific behavior patterns of the avatar is reflected, based on the user's reaction emotion information, to modify the avatar's behavior patterns and personality and to perform reinforcement learning on them, the avatar can recognize and reflect what the user likes and dislikes, so the user can form an emotional relationship with the avatar while raising it.

Therefore, there is a need for a method and apparatus for providing a metaverse virtual reality companion animal capable of forming a relationship based on a user's emotion in a communication system.

Republic of Korea Patent No. 10-2199843 (2020.12.31.) (System for providing a virtual companion animal in augmented reality)

To solve the problems described above, the present invention has the following objects.

An object of the present invention is to provide a method and apparatus for providing a metaverse virtual reality companion animal capable of forming a relationship based on a user's emotion in a communication system.

An object of the present invention is to provide a method and apparatus, in a communication system, for generating a companion animal avatar in the virtual reality of the metaverse, analyzing the user's reaction emotion information from an image of the user's face reacting to the screen showing the avatar's motion, and then having the avatar perform an interactive response based on the user's reaction emotion information.

An object of the present invention is to provide a method and apparatus, in a communication system, for modifying a companion animal avatar's behavior patterns and personality and performing reinforcement learning on them by reflecting the user's feedback on specific behavior patterns of the avatar, based on the user's reaction emotion information.

An object of the present invention is to provide a method and apparatus for providing a companion animal avatar that can emotionally bond and form a relationship with a user in a communication system.

The objects of the present invention are not limited to those mentioned above, and other objects not mentioned will be clearly understood by those of ordinary skill in the art from the description below.

Various embodiments of the present invention provide a method of operating a user's terminal in a communication system. The terminal includes a transceiver, a memory, a processor, an input device, an output device, and a camera. The method includes: transmitting, to a server, setting information of a companion animal received through the input device; receiving, from the server, first motion information about the appearance and motion of a companion animal avatar in a virtual space; outputting the motion of the companion animal avatar through the output device based on the first motion information; obtaining, through the camera, a face image of the user looking at the companion animal avatar on the output device; transmitting, to the server, emotion information of the user generated based on the face image; receiving, from the server, second motion information of the companion animal avatar generated based on the emotion information; and outputting the motion of the companion animal avatar in the virtual space through the output device based on the second motion information.
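The sequence of steps in the method above can be pictured with a minimal Python sketch. It is illustrative only: `StubServer`, `estimate_emotion`, the motion names, and the emotion-to-motion rules are all hypothetical stand-ins that do not appear in the patent, and the server is simulated in-process rather than reached over a network.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class StubServer:
    """Hypothetical in-process stand-in for the server (rules are invented)."""
    settings: Dict[str, str] = field(default_factory=dict)

    def register_pet(self, settings: Dict[str, str]) -> Dict[str, str]:
        # Stores the setting information and returns "first motion information".
        self.settings = settings
        return {"appearance": settings.get("species", "dog"), "motion": "idle"}

    def react_to_emotion(self, emotion: str) -> Dict[str, str]:
        # Returns "second motion information" chosen from the user's emotion.
        motion = {"happy": "wag_tail", "sad": "nuzzle"}.get(emotion, "idle")
        return {"appearance": self.settings.get("species", "dog"), "motion": motion}


def estimate_emotion(face_image: bytes) -> str:
    """Placeholder for the terminal's emotion model (an assumption, not the patent's)."""
    return "happy" if face_image else "neutral"


def run_terminal_session(server: StubServer,
                         settings: Dict[str, str],
                         face_image: bytes) -> List[str]:
    """Run the claimed steps once and return the motions that were displayed."""
    displayed: List[str] = []
    first = server.register_pet(settings)        # send settings, receive first motion info
    displayed.append(first["motion"])            # output the avatar's motion
    emotion = estimate_emotion(face_image)       # face image (passed in) -> emotion info
    second = server.react_to_emotion(emotion)    # send emotion info, receive second motion info
    displayed.append(second["motion"])           # output the avatar's reaction
    return displayed
```

In a real deployment the two server calls would be requests over the transceiver, and `face_image` would come from the camera rather than being passed in as an argument.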

Various embodiments of the present invention provide a user's terminal in a communication system. The terminal includes a memory, a processor, an input device, an output device, and a camera, and the processor is configured to perform the method of operating a user's terminal in a communication system according to various embodiments of the present invention.

Various embodiments of the present invention provide a computer program recorded on a computer-readable storage medium. The computer program is configured to perform the method of operating a user's terminal in a communication system according to various embodiments of the present invention.

The present invention can provide a method and apparatus for providing a metaverse virtual reality companion animal capable of forming a relationship based on a user's emotion in a communication system.

The present invention can provide a method and apparatus, in a communication system, for generating a companion animal avatar in the virtual reality of the metaverse, analyzing the user's reaction emotion information from an image of the user's face reacting to the screen showing the avatar's motion, and then having the avatar perform an interactive response based on the user's reaction emotion information.

The present invention can provide a method and apparatus, in a communication system, for modifying a companion animal avatar's behavior patterns and personality and performing reinforcement learning on them by reflecting the user's feedback on specific behavior patterns of the avatar, based on the user's reaction emotion information.

The present invention can provide a method and apparatus for providing a companion animal avatar that can emotionally bond and form a relationship with a user in a communication system.

The effects of the present invention are not limited to those mentioned above, and other effects not mentioned will be clearly understood by those of ordinary skill in the art from the description below.

FIG. 1 illustrates a communication system according to various embodiments of the present invention.
FIG. 2 is a block diagram of the configuration of a user terminal according to various embodiments of the present invention.
FIG. 3 is a block diagram of the configuration of a server according to various embodiments of the present invention.
FIG. 4 illustrates an example of a process in which a user terminal gives an action instruction to a companion animal avatar in metaverse virtual reality by voice message, according to various embodiments of the present invention.
FIG. 5 illustrates the structure of a machine learning model according to various embodiments of the present invention.
FIG. 6 illustrates an example of a biometric information acquisition process according to various embodiments of the present invention.
FIG. 7 illustrates an example of a process of recognizing an emotion based on a video image of a user according to various embodiments of the present invention.
FIG. 8 illustrates an example of a process of recognizing an emotion based on a video image of a user according to various embodiments of the present invention.
FIG. 9 illustrates an example of a process of recognizing an emotion based on a video image of a user according to various embodiments of the present invention.
FIG. 10 illustrates an example of a process of recognizing an emotion based on a video image of a user according to various embodiments of the present invention.
FIG. 11 illustrates an example of a process of recognizing an emotion based on a video image of a user according to various embodiments of the present invention.
FIG. 12 illustrates an example of a process of recognizing an emotion based on a video image of a user according to various embodiments of the present invention.
FIG. 13 illustrates an example of a process of recognizing a user's emotion from a video image of the user using machine learning, according to various embodiments of the present invention.
FIG. 14 illustrates an example of a process of adding to or subtracting from a rapport score for each interaction category using emotions extracted from the user's image and voice, according to various embodiments of the present invention.
FIG. 15 illustrates an example of a process in which the motion of a companion animal avatar is determined by reflecting the user's emotional feedback obtained through face recognition, according to various embodiments of the present invention.
FIG. 16 illustrates an example of a process of first adopting a companion animal avatar in metaverse virtual reality, according to various embodiments of the present invention.
FIG. 17 illustrates an example of a process of naming a companion animal avatar by voice and having it recognize the name, according to various embodiments of the present invention.
FIG. 18 illustrates an example of a process of reflecting a rapport score based on user emotion recognition through training with a companion animal avatar, according to various embodiments of the present invention.

Hereinafter, embodiments of the present invention are described in detail with reference to the accompanying drawings so that those of ordinary skill in the art to which the invention pertains can readily practice it. The present invention may be embodied in many different forms and is not limited to the embodiments described herein.

FIG. 1 illustrates a communication system according to various embodiments of the present invention.

Referring to FIG. 1, a communication system according to various embodiments of the present invention includes user terminals (100: 100-1, 100-2, ..., 100-n), a server (200), and a wired/wireless communication network (300).

The user terminal (100: 100-1, 100-2, ..., 100-n) is a terminal operated by a guardian who raises a companion animal avatar in virtual reality, that is, by a user. The user terminal (100: 100-1, 100-2, ..., 100-n) is an electronic device that can transmit characteristic information of a companion animal to the server (200) through the wired/wireless communication network (300), and can receive from the server (200) information about the companion animal avatar in virtual reality and information about the virtual reality to which the avatar belongs. The user terminal (100: 100-1, 100-2, ..., 100-n) may be an electronic device such as a personal computer, a cellular phone, a smartphone, a tablet computer, or a head-mounted display (HMD) device for virtual reality (VR) capable of transmission and reception, which includes an input device for entering information, an output device for outputting information, a memory for storing information, a camera for capturing images of objects in front of it, a transceiver for transmitting and receiving information, and at least one processor for performing computations on information.

The server (200) is a server operated by a provider of online services based on a metaverse virtual reality platform. The server (200) is an electronic device that can: generate a virtual reality animal avatar whose appearance, behavior patterns, and personality are identical or similar to those of the companion animal the user intends, based on the companion animal characteristic information provided by the user terminal (100: 100-1, 100-2, ..., 100-n) through the wired/wireless communication network (300); train or modify the avatar's behavior patterns and personality according to characteristic information or feedback information input by the user; and provide the user terminal (100) with information for a service that performs audiovisual communication between the user and the virtual reality avatar based on the generated appearance, behavior patterns, and personality. The server (200) may be an electronic device including a memory for storing information, a transceiver for transmitting and receiving information, and at least one processor for performing computations on information.

The wired/wireless communication network (300) provides a communication path through which the user terminals (100: 100-1, 100-2, ..., 100-n) and the server (200) can exchange signals and data. The wired/wireless communication network (300) is not limited to a communication scheme based on any specific communication protocol, and an appropriate scheme may be used depending on the implementation. For example, when configured as an Internet Protocol (IP) based system, the wired/wireless communication network (300) may be implemented as a wired or wireless Internet network, and when the user terminals (100: 100-1, 100-2, ..., 100-n) and the server (200) are implemented as mobile communication terminals, the wired/wireless communication network (300) may be implemented as a wireless network such as a cellular network or a wireless local area network (WLAN).

FIG. 2 is a block diagram of the configuration of a user terminal according to various embodiments of the present invention.

Referring to FIG. 2, a user terminal (100: 100-1, 100-2, ..., 100-n) according to various embodiments of the present invention includes a transceiver (110), a memory (120), a processor (130), an input device (140), an output device (150), and a camera (160).

The transceiver (110) is connected to the processor (130) and transmits and/or receives signals. All or part of the transceiver (110) may be referred to as a transmitter, a receiver, or a transceiver. The transceiver (110) may support at least one of various wired and wireless access systems and communication standards, such as the Institute of Electrical and Electronics Engineers (IEEE) 802.xx system, the IEEE Wi-Fi system, the 3rd Generation Partnership Project (3GPP) system, the 3GPP Long Term Evolution (LTE) system, the 3GPP 5G New Radio (NR) system, the 3GPP2 system, and Bluetooth.

The memory (120) is connected to the transceiver (110), the processor (130), the input device (140), and the output device (150), and may store information entered through the input device (140), information received from the server (200) through communication by the transceiver (110), and the like. The memory (120) may also store information about images captured by the camera (160). In addition, the memory (120) is connected to the processor (130) and may store data such as basic programs for the operation of the processor (130), application programs, setting information, and information generated by the computations of the processor (130). The memory (120) may consist of volatile memory, non-volatile memory, or a combination of the two. The memory (120) may provide stored data at the request of the processor (130).

The processor (130) may be configured to implement the procedures and/or methods proposed in the present invention. The processor (130) controls the overall operation of the user terminal (100). For example, the processor (130) transmits or receives information through the transceiver (110). The processor (130) also writes data to and reads data from the memory (120), receives information through the input device (140), outputs information through the output device (150), and captures images through the camera (160). The processor (130) may include at least one processor.

The input device (140) is connected to the processor (130) and may be used to enter information. According to one embodiment, the input device (140) may take as input information received from another device connected to the wired/wireless communication network (300) through the transceiver (110). The input device (140) may include a touch display, a keypad, a keyboard, and the like.

The output device (150) is connected to the processor (130) and may output information in the form of video, audio, and the like. According to one embodiment, the output device (150) may output information received from another device connected to the wired/wireless communication network (300) through the transceiver (110). The output device (150) may include a display, a speaker, and the like.

The camera (160) is connected to the processor (130) and may capture images of objects in front of it.

FIG. 3 is a block diagram of the configuration of a server according to various embodiments of the present invention.

Referring to FIG. 3, a server (200) according to various embodiments of the present invention includes a transceiver (210), a memory (220), and a processor (230).

The transceiver (210) is connected to the processor (230) and transmits and/or receives signals. All or part of the transceiver (210) may be referred to as a transmitter, a receiver, or a transceiver. The transceiver (210) may support at least one of various wired and wireless access systems and communication standards, such as the Institute of Electrical and Electronics Engineers (IEEE) 802.xx system, the IEEE Wi-Fi system, the 3rd Generation Partnership Project (3GPP) system, the 3GPP Long Term Evolution (LTE) system, the 3GPP 5G New Radio (NR) system, the 3GPP2 system, and Bluetooth.

The memory (220) is connected to the transceiver (210) and may store information received, through communication by the transceiver (210), from the user terminals (100: 100-1, 100-2, ..., 100-n) or the service terminals (300: 300-1, 300-2, ..., 300-n). In addition, the memory (220) is connected to the processor (230) and may store data such as basic programs for the operation of the processor (230), application programs, setting information, and information generated by the computations of the processor (230). The memory (220) may consist of volatile memory, non-volatile memory, or a combination of the two. The memory (220) may provide stored data at the request of the processor (230).

The processor (230) may be configured to implement the procedures and/or methods proposed in the present invention. The processor (230) controls the overall operation of the server (200). For example, the processor (230) transmits or receives information through the transceiver (210). The processor (230) also writes data to and reads data from the memory (220). The processor (230) may include at least one processor.

도 4는 본 발명의 다양한 실시 예들에 따라서 사용자 단말이 메타버스 가상 현실 속 반려 동물 아바타에 대한 행위 지시를 음성 메시지로 수행하기 위한 과정의 일 예를 도시한다.4 illustrates an example of a process for a user terminal to perform an action instruction for a companion animal avatar in a metaverse virtual reality as a voice message according to various embodiments of the present disclosure.

S401 단계에서, 사용자는 사용자의 단말 내 출력 장치에 표시된 메타버스 가상 현실 속 반려 동물 아바타에 대한 지시의 음성 메시지를 말할 수 있다. 단말의 입력 장치에 포함된 마이크를 통하여 단말은 사용자의 음성을 수신할 수 있다.In step S401, the user may speak a voice message instructing the companion animal avatar in the metaverse virtual reality displayed on the output device in the user's terminal. The terminal may receive the user's voice through a microphone included in an input device of the terminal.

S402 단계에서, 단말은 수신한 음성을 STT(speech to text)로 변환하여, 음성 메시지를 문자 메시지로 변환하고, 기존의 대화 셋과 비교하여 새로 수신한 사용자의 메시지를 인식할 수 있다.In step S402, the terminal converts the received voice into speech to text (STT), converts the voice message into a text message, and recognizes the newly received message of the user by comparing it with the existing conversation set.

S403 단계에서, 단말로부터 서버에 음성 메시지가 전송되고, 사용자의 메시지에 따른 지시 값과 연결된 반려 동물 아바타의 행동 값의 연결이 수행된다. 즉, 사용자의 지시에 적합한 반려 동물 아바타의 동작이 결정될 수 있다.In step S403, a voice message is transmitted from the terminal to the server, and the instruction value according to the user's message is connected to the action value of the companion animal avatar. That is, an operation of the companion animal avatar suitable for the user's instruction may be determined.

In step S404, the server transmits to the terminal the motion information corresponding to the action value finally determined for the companion animal avatar, together with the corresponding user interface information. Based on the motion information received from the server, the terminal may display on its output device a screen in which the companion animal avatar moves. In this way, the user can feel that the companion animal avatar in virtual reality is responding to the spoken instruction.
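
Steps S402 and S403 can be sketched as a lookup from recognized text to an action value. The sketch below is only illustrative: the table contents, the action value names, and the substring matching rule are assumptions, since the specification does not disclose the conversation set or the concrete action values.

```python
from typing import Optional

# Hypothetical instruction-to-action table; the specification does not list
# concrete instruction values or action values.
ACTION_TABLE = {
    "sit": "ACTION_SIT",
    "stand up": "ACTION_STAND",
    "wait": "ACTION_WAIT",
}


def recognize_instruction(text: str) -> Optional[str]:
    """S402 (sketch): compare the STT output with the known conversation set."""
    normalized = text.strip().lower()
    for phrase in ACTION_TABLE:
        if phrase in normalized:  # naive substring match, for illustration only
            return phrase
    return None


def decide_action(text: str) -> str:
    """S403 (sketch): link the recognized instruction to an avatar action value."""
    phrase = recognize_instruction(text)
    if phrase is None:
        return "ACTION_IDLE"  # hypothetical fallback when nothing is recognized
    return ACTION_TABLE[phrase]


print(decide_action("Please SIT down"))
```

A real system would replace the substring match with whatever comparison the conversation set supports; the point here is only the two-stage flow from text to instruction to action value.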

FIGS. 5 to 13 below illustrate an example of a process of recognizing the user's emotion information, using a machine learning model stored in the memory of the terminal, from a facial image of the user looking at the screen on which the companion animal avatar is displayed on the output device of the user terminal. The terminal may acquire the facial image of the user looking at the terminal's screen using the front camera of the terminal.

FIG. 5 illustrates the structure of a machine learning model according to various embodiments of the present invention.

Specifically, FIG. 5 illustrates the structure of a multi-layer perceptron (MLP) for providing content based on machine learning analysis of biometric information according to various embodiments of the present invention.

Deep learning, one of the technologies that has recently come to prominence in the field of machine learning, uses a neural network composed of a plurality of hidden layers and the plurality of hidden units they contain. When low-level features are input to a deep learning model, they are transformed, as they pass through the hidden layers, into high-level features that better describe the problem to be predicted. Because this process requires no prior expert knowledge or intuition, subjective factors in feature extraction can be removed and a model with better generalization ability can be developed. Furthermore, because feature extraction and model construction form a single set in deep learning, the final model can be built through a simpler process than with earlier machine learning approaches.

A multi-layer perceptron (MLP) is a type of artificial neural network (ANN) with multiple nodes, based on deep learning. Each node is a neuron, analogous to the neural connection patterns of animals, and uses a nonlinear activation function. This nonlinearity makes it possible to distinguish data that is not linearly separable.

Referring to FIG. 5, the artificial neural network 500 of the MLP model according to various embodiments of the present invention consists of one or more input layers 510, a plurality of hidden layers 530, and one or more output layers 550.

Input data, such as the RGB value of each pixel in at least one ultrasound image per unit time, is input to the nodes of the input layer 510. Here, each item 511 of the user's biometric information (for example, electrocardiogram information, concentration value, and happiness emotion ratio) and of the adjusted content information (for example, content genre, content topic, and content channel) corresponds to a low-level feature of the deep learning model.

The nodes of the hidden layer 530 perform calculations based on the input factors. The hidden layer 530 is a layer that stores units defined as a plurality of nodes formed by combining the user's biometric information and adjusted content information 511. As shown in FIG. 5, the hidden layer 530 may consist of a plurality of hidden layers.

For example, when the hidden layer 530 consists of a first hidden layer 531 and a second hidden layer 533, the first hidden layer 531 stores first units 532 defined as a plurality of nodes formed by combining the user's biometric information and adjusted content information 511, and each first unit 532 corresponds to a higher-level feature of that information 511. The second hidden layer 533 stores second units 534 defined as a plurality of nodes formed by combining the first units of the first hidden layer 531, and each second unit 534 corresponds to a higher-level feature of the first units 532.

The nodes of the output layer 550 represent the calculated prediction results. The output layer 550 may include a plurality of prediction result units 551. Specifically, the prediction result units 551 may consist of two units, a true unit and a false unit. The true unit is a prediction result unit indicating that, after the content is adjusted to the adjusted content, the concentration value and happiness emotion ratio in the user's biometric information are likely to be at or above the threshold concentration value and threshold happiness emotion ratio. The false unit is a prediction result unit indicating that, after the content is adjusted to the adjusted content, the concentration value and happiness emotion ratio are unlikely to be at or above those thresholds.

A weight is assigned to each connection between the prediction result units 551 and the second units 534 included in the second hidden layer 533, which is the last of the hidden layers 530. Based on these weights, the model predicts whether, after the content is adjusted to the adjusted content, the concentration value and happiness emotion ratio in the user's biometric information will be at or above the threshold concentration value and threshold happiness emotion ratio.

For example, when any one of the second units 534 predicts that, after the content is adjusted to the adjusted content, the concentration value and happiness emotion ratio in the user's biometric information will be at or above the threshold concentration value and threshold happiness emotion ratio, that unit is connected to both the true unit and the false unit; its connection to the true unit is assigned a positive weight, and its connection to the false unit is assigned a negative weight. Conversely, when any one of the second units 534 predicts that the concentration value and happiness emotion ratio will be below the thresholds after the content adjustment, that unit is likewise connected to both the true unit and the false unit; its connection to the true unit is assigned a negative weight, and its connection to the false unit is assigned a positive weight.

A plurality of connection lines are formed between the plurality of second units 534 and the true unit. When the sum over these connection lines is positive, the user's biometric information and adjusted content information 511 at the input layer 510 are predicted to be factors for which, after the content is adjusted to the adjusted content, the concentration value and happiness emotion ratio are at or above the threshold concentration value and threshold happiness emotion ratio. According to an embodiment, whether the concentration value and happiness emotion ratio will be at or above the thresholds after the content adjustment may also be predicted by comparing the sum over the connection lines with a preset value.
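
The layer structure described above can be sketched as a forward pass through two hidden layers followed by a two-unit output. This is a minimal sketch under stated assumptions: the ReLU activation, the softmax output, the layer sizes, and every numeric weight are illustrative choices that the specification does not disclose.

```python
import math
from typing import Dict, List

Vector = List[float]
Matrix = List[List[float]]


def dense(x: Vector, w: Matrix, b: Vector) -> Vector:
    """Affine layer: one output per row of w."""
    return [sum(wi * xi for wi, xi in zip(row, x)) + bi for row, bi in zip(w, b)]


def relu(v: Vector) -> Vector:
    """Nonlinear activation (an assumed choice)."""
    return [max(0.0, a) for a in v]


def softmax(v: Vector) -> Vector:
    m = max(v)
    exps = [math.exp(a - m) for a in v]
    total = sum(exps)
    return [e / total for e in exps]


def forward(x: Vector, params: Dict[str, list]) -> Vector:
    h1 = relu(dense(x, params["w1"], params["b1"]))   # first hidden layer 531
    h2 = relu(dense(h1, params["w2"], params["b2"]))  # second hidden layer 533
    # Output layer 550: [score of true unit, score of false unit]
    return softmax(dense(h2, params["w3"], params["b3"]))


# Toy parameters (assumed). The last layer gives the true unit positive weights
# and the false unit negative weights, mirroring the description above.
params = {
    "w1": [[0.5, -0.2, 0.1], [0.3, 0.8, -0.5]],
    "b1": [0.0, 0.1],
    "w2": [[1.0, -1.0], [0.5, 0.5]],
    "b2": [0.0, 0.0],
    "w3": [[1.0, 0.5], [-1.0, -0.5]],
    "b3": [0.0, 0.0],
}
p_true, p_false = forward([0.7, 0.4, 0.9], params)
print(p_true > p_false)
```

Comparing `p_true` with `p_false` plays the role of comparing the weighted sum over the connection lines with a preset value.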

The artificial neural network 500 of the MLP model learns by adjusting its learning parameters. According to an embodiment, the learning parameters include at least one of weights and biases. The learning parameters are adjusted iteratively through an optimization algorithm called gradient descent. Each time a prediction is computed from a given data sample (forward propagation), the performance of the network is evaluated through a loss function that measures the prediction error. Each learning parameter of the artificial neural network 500 is adjusted in small increments in the direction that minimizes the value of the loss function; this process is called back-propagation.
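
The update rule can be illustrated on the smallest possible network, a single sigmoid unit. The sketch below assumes a cross-entropy loss, a learning rate of 0.5, and a toy data set; none of these values come from the specification, and a full MLP would propagate the same kind of gradient back through every layer.

```python
import math
from typing import List, Tuple


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def train(samples: List[Tuple[float, int]], lr: float = 0.5, epochs: int = 200):
    """Stochastic gradient descent on one weight and one bias."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for x, y in samples:
            p = sigmoid(w * x + b)  # forward propagation
            grad = p - y            # d(loss)/dz for the cross-entropy loss
            w -= lr * grad * x      # step against the gradient of the weight
            b -= lr * grad          # step against the gradient of the bias
    return w, b


# Toy data (assumed): inputs 0 and 1 are labeled false, 2 and 3 are labeled true.
samples = [(0.0, 0), (1.0, 0), (2.0, 1), (3.0, 1)]
w, b = train(samples)
print(sigmoid(w * 3.0 + b) > 0.5, sigmoid(w * 0.0 + b) < 0.5)
```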

Through the model described above, it is possible to predict whether, after the content is adjusted to the adjusted content, the concentration value and happiness emotion ratio in the user's biometric information will be at or above the threshold concentration value and threshold happiness emotion ratio, and thus to determine appropriate adjusted content.

FIG. 6 illustrates an example of a biometric information acquisition process according to various embodiments of the present invention.

Referring to FIG. 6, an example is shown of a process of obtaining the user's biometric information from a video image of the user's upper body, including the face.

Specifically, the terminal may obtain information on the user's emotion type and emotion ratio by analyzing the pattern of specific emotion points in the face and the pattern of specific movement points in the upper body in the video image. In more detail, the terminal acquires from the user facial images of a basic (neutral) expression and of emotional expressions for a number of emotional states to be compared against it. A standard model that defines reference vertices for each facial region is masked (matched) onto the facial image of the basic or emotional expression, so that multiple vertices are mapped to each facial region of the user's facial image. The terminal then extracts the coordinate changes between vertices of the same facial region in the basic-expression and emotional-expression images, groups the vertices by the action units (AUs) defined for the human face, and extracts per-action-unit facial movement information that includes the change in the centroid of each action unit group's vertices. The extracted facial movement information is stored in a database for each emotional state, from which the emotion type and emotion ratio are obtained.

Here, the per-action-unit facial movement information may include information on happiness, neutrality, anger, fear, surprise, sadness, disgust, and the like.

FIG. 7 illustrates an example of a process of recognizing emotion from the user's video image according to various embodiments of the present invention.

In various embodiments of the present invention, the Korean terms 감정 and 감성 are used interchangeably; both refer to the same meaning, emotion.

According to various embodiments of the present invention, the terminal may perform i) face detection in the video image, ii) facial point detection, iii) facial muscle definition, iv) muscle movement tracking, v) effective emotion parameter extraction, vi) rule base construction, and vii) emotion recognition.

i) For face detection, the terminal uses the Viola-Jones algorithm with a Haar model based on facial features. Because it is based on facial features, this approach overcomes the lighting-related limitations of earlier color-based models, and it has a lower computing cost than AAM-based methods, a statistical approach based on face models, so it can be applied to real-time processing.

ii) For facial point detection, the terminal may detect 68 facial points for the eyes, nose, and mouth using the DLIB-based ensemble of regression trees algorithm.

iii) For facial muscle definition, the terminal may define 39 expression-sensitive facial muscles from the facial points, in accordance with the Facial Action Coding System.

iv) For muscle movement tracking, the terminal may track the movement of the facial muscles across video frames, tracking the changes in the XY coordinates and area of each facial muscle as the expression changes.

v) For effective emotion parameter extraction, the terminal may extract effective emotion parameters based on the facial muscle movement features of the six basic emotions, and thereby extract the emotions of happiness, sadness, surprise, anger, disgust, and fear.

vi) For rule base construction, the terminal may build a rule base for emotion recognition from real facial expression data. A universal neutral face model is used to overcome the personalization problem, and a face model for emotion recognition is built through rule base construction based on statistical analysis.

vii) For emotion recognition, the terminal may use a fuzzy-based emotion recognition algorithm to calculate a probability for each emotion from the rule base, and may minimize the sensitivity of emotion recognition through a probabilistic treatment of each emotion.

FIG. 8 illustrates an example of a process of recognizing emotion from the user's video image according to various embodiments of the present invention.

Specifically, FIG. 8 illustrates a process of extracting face image data from a video image of the user's upper body, including the face.

First, the terminal performs Haar feature extraction. Specifically, the terminal compares each part of the image pixel by pixel using kernels corresponding to facial features such as the eyes and nose, and calculates the probability that a face is present in the input image. For each kernel, the terminal takes the difference between the brightness values under the black region and the white region in the input image and looks for locations where the difference is at or above a threshold.
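
The black-minus-white brightness comparison can be sketched with an integral image, the usual device that makes Haar features cheap to evaluate in Viola-Jones detectors. The kernel shape (a two-rectangle vertical edge), the toy image, and the threshold value below are assumptions for illustration; the specification does not give the actual kernels or thresholds.

```python
from typing import List

Image = List[List[int]]


def integral_image(img: Image) -> List[List[int]]:
    """ii[y][x] holds the sum of img over rows < y and columns < x."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii: List[List[int]], x: int, y: int, w: int, h: int) -> int:
    """Sum of the w-by-h rectangle whose top-left pixel is (x, y)."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]


def two_rect_feature(ii: List[List[int]], x: int, y: int, w: int, h: int) -> int:
    """Vertical-edge Haar feature: left (white) half minus right (black) half."""
    half = w // 2
    white = rect_sum(ii, x, y, half, h)
    black = rect_sum(ii, x + half, y, half, h)
    return white - black


# Toy 2x4 image: bright on the left, dark on the right.
img = [[9, 9, 1, 1],
       [9, 9, 1, 1]]
ii = integral_image(img)
feature = two_rect_feature(ii, 0, 0, 4, 2)
THRESHOLD = 20  # assumed value
print(feature, feature >= THRESHOLD)
```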

Next, the terminal performs cascade classification. Specifically, the terminal detects faces using a cascade classifier that can decide from Haar features whether a region is a face. In the embodiment of FIG. 8, the pre-trained Frontalface_alt2 model was applied. The terminal first narrows down the candidates using distinctive Haar features such as the eyes, and then detects the final face using the Haar features of finer facial elements.

Next, the terminal detects landmarks. In the embodiment of FIG. 8, the terminal applied a DLIB-based ensemble of regression trees model trained on the iBUG 300-W dataset. For the frontal face detected by the Haar model, 68 facial points are detected for the eyes, nose, mouth, eyebrows, jaw, and so on. Each facial point has x and y coordinate values in two-dimensional space.

Next, the terminal performs coordinate system normalization. Specifically, the detected facial points lie in a coordinate system whose origin is the upper left corner of the screen. As a result, the same expression can yield different values depending on where the face is located, which acts as noise. This problem can be resolved by normalizing to a relative coordinate system whose origin is the glabella (the point between the eyebrows).
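
The normalization amounts to subtracting the glabella position from every landmark. In the sketch below the glabella is approximated as the midpoint of two landmarks, with default indices 21 and 22 (the inner eyebrow ends in the common 0-based 68-point layout); that approximation and those indices are assumptions, since the specification does not say how the glabella is located.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def normalize_landmarks(points: List[Point],
                        left_idx: int = 21,
                        right_idx: int = 22) -> List[Point]:
    """Shift landmarks so that the assumed glabella becomes the origin."""
    ox = (points[left_idx][0] + points[right_idx][0]) / 2.0
    oy = (points[left_idx][1] + points[right_idx][1]) / 2.0
    return [(x - ox, y - oy) for x, y in points]


# Two copies of the same synthetic "face" at different screen positions.
face = [(float(i), float(2 * i)) for i in range(68)]
shifted = [(x + 100.0, y + 50.0) for x, y in face]

# After normalization the screen position no longer matters.
print(normalize_landmarks(face) == normalize_landmarks(shifted))
```

Only translation is removed here; differences in face scale or head rotation would need additional normalization that the specification does not describe.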

FIG. 9 illustrates an example of a process of recognizing emotion from the user's video image according to various embodiments of the present invention.

Specifically, FIG. 9 shows the facial point indices for each named action unit (AU).

The facial muscles are defined from the facial points in accordance with the Facial Action Coding System (FACS) as follows. FACS is a reference system that defines the facial muscle movements used to express emotion (Ekman P, 1978). To analyze the corresponding facial muscles, 39 facial muscles (action units) are defined on the basis of the 68 facial points. The anatomy of the face is taken into account so that the muscle movements defined in FACS can be represented appropriately.

FIG. 10 illustrates an example of a process of recognizing emotion from the user's video image according to various embodiments of the present invention.

Specifically, FIG. 10 shows examples of facial muscles represented as polygons.

According to various embodiments of the present invention, the terminal may extract facial muscle features. Specifically, the terminal calculates the area and the center x, y coordinates of each facial muscle from the XY coordinates of the facial points. Because the facial muscles are irregular polygons, the area is calculated with the polygon shape taken into account, and the center x, y coordinates are likewise calculated as the centroid of the polygon.
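
One standard way to obtain the area and centroid of an irregular polygon is the shoelace formula, sketched below. The specification does not name the formula it uses, so this is an assumed choice; it requires the vertices to be listed in order around a simple (non-self-intersecting) polygon.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def polygon_area_centroid(vertices: List[Point]) -> Tuple[float, Point]:
    """Return (area, (cx, cy)) of a simple polygon via the shoelace formula."""
    twice_signed_area = 0.0
    cx_acc = 0.0
    cy_acc = 0.0
    n = len(vertices)
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]  # wrap around to close the polygon
        cross = x0 * y1 - x1 * y0
        twice_signed_area += cross
        cx_acc += (x0 + x1) * cross
        cy_acc += (y0 + y1) * cross
    if twice_signed_area == 0.0:
        raise ValueError("degenerate polygon with zero area")
    # Centroid = accumulated sum / (6 * signed area) = sum / (3 * twice_signed_area)
    centroid = (cx_acc / (3.0 * twice_signed_area),
                cy_acc / (3.0 * twice_signed_area))
    return abs(twice_signed_area) / 2.0, centroid


# A 2x2 square stands in for a facial-muscle polygon.
area, center = polygon_area_centroid([(0.0, 0.0), (2.0, 0.0), (2.0, 2.0), (0.0, 2.0)])
print(area, center)
```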

According to various embodiments of the present invention, the terminal may track the movement of the facial muscles. Specifically, to track facial muscle movement, the terminal extracts the area and the center x, y coordinates of each muscle in every frame and calculates how much each muscle feature of the current face has changed relative to the neutral (everyday) face. These changes are used to track the muscle movement from frame to frame.
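
The per-frame change relative to the neutral face can be sketched as an element-wise difference of muscle features. The dictionary layout, the muscle name `AU12`, and the numeric values are hypothetical and serve only to illustrate the bookkeeping.

```python
from typing import Dict, Tuple

# (area, center_x, center_y) of one facial-muscle polygon
Feature = Tuple[float, float, float]


def feature_changes(neutral: Dict[str, Feature],
                    current: Dict[str, Feature]) -> Dict[str, Feature]:
    """Change of each muscle feature in the current frame vs. the neutral face."""
    return {
        name: tuple(c - n for c, n in zip(current[name], neutral[name]))
        for name in neutral
    }


# Hypothetical values for a single muscle.
neutral_face = {"AU12": (10.0, 1.0, 2.0)}
frame = {"AU12": (12.5, 1.5, 1.0)}
changes = feature_changes(neutral_face, frame)
print(changes["AU12"])
```

Calling `feature_changes` once per video frame yields the time series of muscle movement that the later steps consume.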

FIG. 11 illustrates an example of a process of recognizing emotion from the user's video image according to various embodiments of the present invention.

Specifically, FIG. 11 illustrates a process of obtaining data for each emotion based on facial muscle movement.

According to various embodiments of the present invention, the terminal may determine the facial muscle movement features for each basic emotion. The movement features of the facial muscles are defined for each emotion as specified by FACS. All six basic emotions are considered: happiness, sadness, surprise, anger, disgust, and fear.

FIG. 12 illustrates an example of a process of recognizing emotion from the user's video image according to various embodiments of the present invention.

Specifically, FIG. 12 shows a rule base for emotion recognition. The rule base of FIG. 12 is the result of statistical analysis and rule base construction.

The statistical analysis was performed as follows. To build a rule base for emotion recognition, the muscle movement feature variables were analyzed statistically using real facial expression data. Facial expression photographs of the six basic emotions (happiness, sadness, surprise, anger, disgust, and fear) were collected from each of 62 people, and ANOVA was performed to compare the feature variables across seven groups, including a neutral photograph. Because every emotion had the same number of samples, Tukey's test was used for the post hoc analysis.
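
The group comparison can be illustrated with the F statistic of a one-way ANOVA, computed here from first principles. The three small groups are made-up numbers, not data from the specification, and the sketch stops at the F statistic: it computes neither the p-value nor the Tukey post hoc comparison.

```python
from typing import List


def one_way_anova_f(groups: List[List[float]]) -> float:
    """F statistic: between-group mean square / within-group mean square."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within


# Made-up values of one muscle feature for three expression groups.
f_stat = one_way_anova_f([[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]])
print(round(f_stat, 6))
```

A large F value suggests that the feature differs between at least two expression groups, which is what qualifies it as a candidate for the rule base.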

The rule base was constructed as follows. The muscle movement feature variables that the statistical analysis showed to differ by emotion were assembled into a rule base for emotion recognition. In addition, a universal neutral face model was built to overcome the limitation that facial structure differs slightly from person to person.

Finally, according to various embodiments of the present invention, the terminal calculates the emotion with a fuzzy-based emotion recognition algorithm. Specifically, the terminal calculates a probability for each emotion for the current face using the rule base built from real data and, to recognize the final emotion, calculates a final probability that takes into account how much more probable each emotion is than every other emotion. For example, to calculate the probability that the current face shows happiness, the terminal considers all of the following: i) the probability that the current face shows happiness rather than the neutral state; ii) the probability that it shows happiness rather than sadness; iii) the probability that it shows happiness rather than surprise; iv) the probability that it shows happiness rather than anger; v) the probability that it shows happiness rather than disgust; and vi) the probability that it shows happiness rather than fear. Having considered these, the terminal recognizes the emotion with the highest probability as the current emotion.
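
The pairwise comparison can be sketched as follows. The specification does not state how the pairwise probabilities are combined, so the sketch assumes the minimum (a common fuzzy AND): an emotion scores highly only if it beats every alternative. The probability values are invented, and only two emotions are shown for brevity.

```python
from typing import Dict

# pairwise[e][other] = probability that the face shows e rather than other
Pairwise = Dict[str, Dict[str, float]]


def final_scores(pairwise: Pairwise) -> Dict[str, float]:
    """Combine pairwise probabilities with min (an assumed fuzzy AND)."""
    return {emotion: min(versus.values()) for emotion, versus in pairwise.items()}


def recognize(pairwise: Pairwise) -> str:
    """Pick the emotion with the highest combined score."""
    scores = final_scores(pairwise)
    return max(scores, key=scores.get)


# Invented pairwise probabilities for a smiling face.
pairwise = {
    "happiness": {"neutral": 0.9, "sadness": 0.8},
    "sadness": {"neutral": 0.6, "happiness": 0.2},
}
print(final_scores(pairwise), recognize(pairwise))
```

Other combination rules, such as a product or a weighted mean, would fit the description equally well; the minimum is used here only because it is the simplest fuzzy conjunction.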

FIG. 13 illustrates an example of a process of recognizing the user's emotion from the user's video image using machine learning, according to various embodiments of the present invention.

Referring to FIG. 13, a process is shown in which a video image of the user's face is given as input, passes through a machine learning model, and the user's emotion information is finally output.

Through the machine learning model, the terminal may analyze which specific emotion the user has. In addition, through emotion analysis, the user's current degree of concentration (for example, whether the user is in a normal state, a concentrated state, or an immersed state) may be calculated as a concentration value between 0 and 100. This value can be related to the intensity of the emotion.

FIG. 14 illustrates an example of a process of adding or subtracting a degree of rapport for each interaction category, using emotions extracted from the user's image and voice, according to various embodiments of the present invention.

Referring to FIG. 14, interactions are classified by the situation in which the user and the companion animal avatar in virtual reality interact. For example, the situations are classified into adopting, caring for, training, and playing with the companion animal avatar.

In addition, specific sub-steps are classified for each situation in which the user and the companion animal avatar in virtual reality interact. For example, adoption includes completing avatar selection, setting a name, and so on. Care includes feeding, cleaning up feces and urine, bathing, nursing, and so on. Training includes calling the avatar's name, "sit", "stand up", "wait", tricks, and so on. Play includes going for a walk, traveling, and so on.

In addition, the analysis source for the emotion type or concentration is classified according to whether it is the user's face image or voice.

In addition, the intensity of the emotion or the intensity of the concentration may be analyzed for each emotion type or concentration state.

In addition, whether to add to or subtract from the rapport between the user and the companion animal avatar is determined based on the analyzed emotion type, emotion intensity, or concentration intensity of the user toward the companion animal avatar.

The determined addition or subtraction of rapport may subsequently affect various aspects such as the companion animal avatar's loyalty to the user, its training performance, and its growth rate. By watching their own emotions affect the companion animal avatar, the user can feel that they are interacting emotionally with it.

FIG. 15 illustrates an example of a process in which the motion of the companion animal avatar is determined by reflecting the user's emotional feedback obtained through recognition of the user's face, according to various embodiments of the present invention.

Referring to FIG. 15, when the user uses the companion animal avatar application (app) on the terminal, the companion animal avatar in the application may react in the way a real companion animal responds when its owner comes looking for it.

When the application runs, the terminal may recognize the user's face image. The terminal may then determine, from the user's face image or voice, the user's emotion toward the companion animal avatar and the user's state of concentration on the companion animal avatar.

The terminal may analyze the user's emotion type, emotion intensity, and concentration from the user's face image or voice, determine whether these are positive or negative in terms of rapport, generate the user's emotion information, and transmit it to the server.

The server may determine a motion of the companion animal avatar corresponding to the user's emotional state based on the emotion information received from the terminal, and may transmit motion information of the companion animal avatar to the terminal. For example, when the user is determined to have a positive emotion or to have concentrated on the companion animal avatar for a long time, the motion may be determined to be acting cute or showing off a trick. When the user is determined to have a neutral or negative emotion, the motion may be determined to be comforting the user. When no emotion is detected from the user for a certain period because the user has been away from the terminal for a long time, or when the user's concentration on the companion animal avatar is determined to be decreasing, the motion may be determined to be barking to call the user.
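The server-side decision described for FIG. 15 can be sketched as a small rule table. The sketch below is illustrative and not the applicant's implementation: the `EmotionInfo` fields, the action labels, and the order in which the rules are checked (absence or falling concentration first, then positive emotion or long focus, then comfort) are assumptions, since the description does not say how overlapping conditions are resolved.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class EmotionInfo:
    # "positive", "neutral", "negative", or None when no emotion was detected
    polarity: Optional[str]
    # True when the user has focused on the avatar for a long time
    long_focus: bool = False
    # True when the user's concentration on the avatar is dropping
    concentration_falling: bool = False


def decide_action(info: EmotionInfo) -> str:
    """Pick an avatar motion from the user's emotion information."""
    # Assumed precedence: a missing or drifting user is called back first.
    if info.polarity is None or info.concentration_falling:
        return "bark_to_call"
    if info.polarity == "positive" or info.long_focus:
        return "act_cute_or_show_trick"
    # Remaining cases are neutral or negative emotions.
    return "comfort"
```

For example, `decide_action(EmotionInfo(polarity="negative"))` selects the comforting motion, while `decide_action(EmotionInfo(polarity=None))` selects barking to call the user.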

FIG. 16 illustrates an example of a process of first adopting a companion animal avatar in metaverse virtual reality, according to various embodiments of the present invention.

Referring to FIG. 16, the situation in which the user first adopts a companion animal avatar is shown as one of the situations in which the user bonds with a companion animal avatar in metaverse virtual reality.

The user may select a preferred companion animal avatar from among the various companion animal avatars on the screen, pay with goods such as currency or coins in the virtual reality, and then adopt the companion animal avatar.

FIG. 17 illustrates an example of a process of naming a companion animal avatar by voice and having the name recognized, according to various embodiments of the present invention.

Referring to FIG. 17, the user may give a name to the newly adopted companion animal avatar by speaking the name aloud so that the terminal recognizes it. The terminal may confirm with the user the name recognized through speech-to-text (STT), analyze the emotion in the user's voice message when recognizing the name, and, in the case of a positive emotion, set the companion animal avatar's rapport with the user to increase.

FIG. 18 illustrates an example of a process of reflecting rapport according to recognition of the user's emotion during training with a companion animal avatar, according to various embodiments of the present invention.

Referring to FIG. 18, an example of determining the companion animal avatar's reaction in a training situation is shown, as one of the situations in which the user can bond with the companion animal avatar.

The user may input an instruction for the companion animal avatar by a voice message, a button click, or the like. The terminal that recognizes the user's instruction transmits its content to the server. The server may determine the motion of the companion animal avatar corresponding to the instruction based on the received content. In addition, a voice message is not treated merely as an instruction; the user's emotion in the voice message may also be analyzed. By analyzing the user's emotion information, it may be determined whether the user feels favorably toward the companion animal avatar. When such a favorable feeling is detected, the server may determine that the companion animal avatar performs a trick in addition to the instructed action.

When a positive emotion is extracted, the rapport may increase; when a negative emotion is extracted, the rapport may decrease; and when a neutral emotion is extracted, the rapport may remain unchanged.
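The polarity rule above can be written as a one-step update. This is a sketch under stated assumptions: the description does not give a scale for the rapport or a step size, so the 0 to 100 clamp, `RAPPORT_STEP`, and the function name `update_rapport` are hypothetical.

```python
RAPPORT_STEP = 5  # hypothetical step size; not specified in the description


def update_rapport(rapport: float, polarity: str, step: float = RAPPORT_STEP) -> float:
    """Raise, lower, or keep the rapport according to the extracted emotion polarity."""
    deltas = {"positive": step, "negative": -step, "neutral": 0}
    if polarity not in deltas:
        raise ValueError(f"unknown polarity: {polarity!r}")
    # Assumed 0-100 scale, mirroring the range used for the other indices.
    return max(0, min(100, rapport + deltas[polarity]))
```

Clamping keeps repeated positive or negative interactions from pushing the value outside the assumed range.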

According to various embodiments of the present invention, the user may select and adopt a desired companion animal avatar and then interact with it in a space in the metaverse called a training center. By way of example, the rapport between the user and the companion animal avatar may be raised through the following interactions. For example, the user may raise rapport by naming the companion animal avatar, calling its name by voice, and using a petting menu so that the avatar recognizes the user as its owner. For example, the user may raise rapport by regularly performing care activities directly tied to survival, such as feeding, giving water, and cleaning up urine and feces. For example, the user may raise rapport through simple training performed while giving treats, such as sit, stand up, lie down, roll over, and wait. For example, the user may raise rapport by taking the companion animal avatar for walks, playing with it, and traveling with it.

During the exemplary interactions described above, the result of emotion recognition based on recognition of the user's face may be reflected in the rapport between the user and the companion animal avatar. For example, which of three concentration types the user is in, namely a normal state, a concentrated state, or an immersed state, may be measured. For example, which of seven emotion types the user has, namely joy, surprise, sadness, anger, fear, displeasure, or calmness, may be measured. The degree of concentration and the intensity of the emotion may be measured together with each concentration type and emotion type.

The higher the user's concentration at the moment of interacting with the companion animal avatar, the greater the weight that may be applied to the increase in rapport. In addition, for joy or surprise, which are classified as positive among the seven emotion types, a weight may be applied to the increase in rapport.

When the user's concentration and positive emotions are high, the user's emotions are conveyed to the companion animal avatar in the virtual world, and the companion animal avatar's growth rate, training speed and performance, and loyalty may increase.

For a user and a companion animal avatar with high rapport, the companion animal avatar may react first to the user's emotion recognized from the user's face and perform a certain action. Actions such as licking the user for comfort when the user is sad, or acting cute or fetching an object the user likes when the user is in a bad mood, may be performed proactively without the user's instruction.
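The proactive behavior for high-rapport pairs can be sketched as a gated lookup. The threshold `PROACTIVE_RAPPORT_MIN`, the action labels, and the choice to map "in a bad mood" to the displeasure emotion type are assumptions for illustration; the description states only that the behavior applies when rapport is high.

```python
from typing import Optional

PROACTIVE_RAPPORT_MIN = 70  # hypothetical threshold for "high" rapport

# Emotion type -> proactive avatar action (labels are illustrative).
PROACTIVE_ACTIONS = {
    "sadness": "lick_to_comfort",
    "displeasure": "act_cute_or_fetch_favorite_object",
}


def proactive_action(rapport: float, emotion_type: str) -> Optional[str]:
    """Return an action the avatar starts on its own, or None if it waits for instructions."""
    if rapport < PROACTIVE_RAPPORT_MIN:
        return None
    return PROACTIVE_ACTIONS.get(emotion_type)
```

Returning `None` leaves the avatar in its ordinary instruction-driven mode, both for low-rapport pairs and for emotion types with no proactive response defined.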

According to various embodiments of the present invention, in raising and interacting with a companion animal avatar in the virtual world, the user may use emotion recognition based on face recognition in addition to menu selection on the terminal screen and voice instructions. Thus, even if the user does not speak or gesture, the companion animal avatar can recognize the emotions the user is feeling from facial expressions alone, and this can be actively used in the interaction with the companion animal avatar in the virtual world.

In various embodiments of the present invention, the user's emotion indicators include an emotion index and a concentration index. A brief description of each indicator follows.

The user's emotion index may include indices for seven emotion types, namely joy, surprise, sadness, anger, fear, displeasure, and calmness. It is expressed as a value between 0 and 100, and a higher value is more positive.

The user's concentration index may include an index for three levels of concentration, namely a normal state, a concentrated state, and an immersed state. It is expressed as a value between 0 and 100, and a higher value indicates higher concentration.

When the user names the companion animal avatar or calls its name, emotion is recognized from the voice, and the more positive the emotion, the faster the rapport rises.

In training the companion animal avatar, the higher the user's concentration index and the higher the joy and surprise components of the emotion index, the faster the rapport rises.

In walking and playing with the companion animal avatar, the higher the user's concentration index and the higher the joy component of the emotion index, the faster the rapport rises.
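The situation-dependent rules above (joy and surprise for training, joy for walking and play, with concentration always contributing) can be sketched as a gain function. Only the choice of boosting emotions per situation comes from the description; the additive formula, `BASE_GAIN`, and the function name `rapport_gain` are assumptions.

```python
# Emotion components that speed up rapport in each situation (from the description).
BOOSTING_EMOTIONS = {
    "training": ("joy", "surprise"),
    "walk": ("joy",),
    "play": ("joy",),
}

BASE_GAIN = 1.0  # hypothetical rapport gain for one interaction


def rapport_gain(situation: str, concentration: float, emotion_index: dict) -> float:
    """Scale the base gain by concentration and by the situation's boosting emotions.

    `concentration` and every value in `emotion_index` are on the 0-100 scale.
    """
    boosters = BOOSTING_EMOTIONS[situation]
    # Average of the relevant emotion components, normalized to 0-1.
    emotion_factor = sum(emotion_index.get(name, 0.0) for name in boosters) / (100.0 * len(boosters))
    concentration_factor = concentration / 100.0
    # Assumed additive weighting: the gain ranges from 1x to 3x the base.
    return BASE_GAIN * (1.0 + concentration_factor + emotion_factor)
```

With this weighting, surprise raises the gain during training but has no effect during a walk, which matches the per-situation distinction drawn in the text.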

According to an embodiment, the setting information of the companion animal may include at least one of an animal type of the companion animal, a specific breed of the companion animal within the animal type, and detailed features of the companion animal's appearance. The setting information of the companion animal may further include basic characteristic setting information including physical characteristics, behavioral characteristics, and personality characteristics of the companion animal. Personality information of the companion animal avatar may be generated based on the basic characteristic setting information. The personality information may relate to at least one of the companion animal avatar's rapport with the user, its loyalty to the user, its training speed, and its growth rate.

According to an embodiment, the memory may store a machine learning model for analyzing the emotion information of the user based on the face image of the user. The emotion information of the user may include information on one or more emotion types detected at or above a threshold ratio among a plurality of predetermined emotion types, and information on a concentration value, on a predetermined concentration scale, of the user's concentration on the companion animal avatar as detected from the face image. For each of the plurality of predetermined emotion types, a positive or negative emotion value may be predetermined in relation to the relationship with the companion animal avatar. The emotion information may further include the emotion type information and the emotion value for the one or more detected emotions, and the detected concentration value.
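Assembling the emotion information from per-type detection ratios can be sketched as below. This is a hedged illustration: the threshold value, the dictionary layout, and the signed valence table are assumptions. The description identifies only joy and surprise as positive; treating calmness as 0 and the remaining types as negative is a guess made for the example.

```python
DETECTION_THRESHOLD = 0.3  # hypothetical threshold ratio

# Signed valence per emotion type. Joy and surprise being positive follows the
# description; the other signs are assumptions for illustration.
EMOTION_VALENCE = {
    "joy": 1, "surprise": 1, "calmness": 0,
    "sadness": -1, "anger": -1, "fear": -1, "displeasure": -1,
}


def build_emotion_info(type_ratios: dict, concentration: float,
                       threshold: float = DETECTION_THRESHOLD) -> dict:
    """Keep emotion types detected at or above the threshold and attach signed values."""
    detected = [name for name, ratio in type_ratios.items() if ratio >= threshold]
    # Strongest detected emotion first.
    detected.sort(key=lambda name: type_ratios[name], reverse=True)
    return {
        "emotion_types": detected,
        "emotion_values": {
            name: EMOTION_VALENCE[name] * type_ratios[name] * 100 for name in detected
        },
        "concentration": concentration,
    }
```

The returned dictionary stands in for the payload the terminal would transmit to the server; an actual message format is not given in the source.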

According to an embodiment, the personality information of the companion animal avatar may be modified by repeating the process of transmitting the emotion information and the process of receiving the second motion information of the companion animal avatar generated based on the emotion information. New second motion information of the companion animal avatar may be generated and received based on the personality information.

According to an embodiment, the method may further include receiving, from the server, arbitrary simulation information about a first situation in which the companion animal avatar is placed, together with the first motion information. The emotion information may include the user's emotion type information, emotion value, and concentration value with respect to the motion of the companion animal avatar based on the first motion information in the first situation. When simulation information for a second situation similar to the first situation is received together with the second motion information, the companion animal avatar may act according to the second motion information generated based on the emotion information.

According to an embodiment, the input device may include a microphone. The method may further include transmitting to the server a voice message from the user to the companion animal avatar, acquired through the microphone. The more the voice message includes the name set for the companion animal avatar, the higher the positive emotion value and the concentration value included in the emotion information transmitted to the server. In the personality information of the companion animal avatar, the rapport with the user may be modified based on the emotion information. New second motion information of the companion animal avatar may be generated and received based on the personality information.

According to an embodiment, in the personality information of the companion animal avatar, the rapport with the user may be modified based on the emotion information. Depending on the type of the first situation in which the companion animal avatar is placed, the degree to which the personality information is modified in response to the user's emotion type information, emotion value, and concentration value may be determined. When the type of the first situation is one of a plurality of predetermined situations and the user's emotion type information includes the emotion type predetermined as corresponding to that situation, the degree to which the personality information is modified may be high.

According to an embodiment, the plurality of predetermined emotion types may include one or more of joy, surprise, sadness, anger, fear, displeasure, and calmness. The plurality of predetermined situations may include training, walking, and play. When the type of the first situation is training and the user's emotion type information is joy or surprise, the degree to which the personality information is modified may be high. When the type of the first situation is walking or play and the user's emotion type information is joy, the degree to which the personality information is modified may be high.

When an embodiment of the present invention is implemented in hardware, the processor of the present invention may be provided with application-specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field-programmable gate arrays (FPGAs), or the like configured to carry out the present invention.

Meanwhile, the method described above can be written as a computer-executable program and implemented on a general-purpose digital computer that runs the program using a computer-readable medium. In addition, the data structures used in the method may be recorded on a computer-readable storage medium through various means. Program storage devices, a term that may be used to describe storage devices containing executable computer code for performing the various methods of the present invention, should not be understood to include transitory objects such as carrier waves or signals. The computer-readable storage medium includes storage media such as magnetic storage media (for example, ROM, floppy disks, and hard disks) and optically readable media (for example, CD-ROM and DVD).

The embodiments described above combine the elements and features of the present invention in predetermined forms. Each element or feature should be considered optional unless explicitly stated otherwise. Each element or feature may be practiced without being combined with other elements or features. It is also possible to form an embodiment of the present invention by combining some of the elements and/or features. The order of the operations described in the embodiments may be changed. Some elements or features of one embodiment may be included in another embodiment, or may be replaced with corresponding elements or features of another embodiment. It is evident that claims that do not explicitly cite one another may be combined to form an embodiment or may be included as new claims by amendment after filing.

It will be apparent to those skilled in the art that the present invention may be embodied in other forms without departing from its technical spirit and essential characteristics. Accordingly, the above embodiments should be considered in all respects as illustrative and not restrictive. The scope of the present invention should be determined by reasonable interpretation of the appended claims, and all changes within the equivalent scope of the present invention are included in its scope.

100: user terminal 110: transceiver
120: memory 130: processor
140: input device 150: output device
160: camera 200: server
210: transceiver 220: memory
230: processor 300: wired/wireless communication network
500: artificial neural network 510: input layer
511: input information 530: hidden layer
531: first hidden layer 532: first unit
533: second hidden layer 534: second unit
550: output layer 551: prediction result unit

Claims

A method of operating a terminal of a user in a communication system, the terminal including a transceiver, a memory, a processor, an input device, an output device, and a camera, the method comprising:
transmitting setting information of a companion animal input through the input device to a server;
receiving, from the server, first motion information about the appearance and motion of a companion animal avatar in a virtual space;
outputting a motion of the companion animal avatar through the output device based on the first motion information;
obtaining, through the camera, a face image of the user looking at the companion animal avatar on the output device;
transmitting emotion information of the user generated based on the face image to the server;
receiving, from the server, second motion information of the companion animal avatar generated based on the emotion information; and
outputting the motion of the companion animal avatar in the virtual space through the output device based on the second motion information.

The method of claim 1, wherein:
the setting information of the companion animal includes at least one of an animal type of the companion animal, a specific breed of the companion animal within the animal type, and detailed features of the companion animal's appearance;
the setting information of the companion animal further includes basic characteristic setting information including physical characteristics, behavioral characteristics, and personality characteristics of the companion animal;
personality information of the companion animal avatar is generated based on the basic characteristic setting information; and
the personality information is related to at least one of the rapport of the companion animal avatar with the user, the loyalty of the companion animal avatar to the user, the training speed of the companion animal avatar, and the growth rate of the companion animal avatar.

The method of claim 2, wherein:
the memory stores a machine learning model for analyzing the emotion information of the user based on the face image of the user;
the emotion information of the user includes information on one or more emotion types detected at or above a threshold ratio among a plurality of predetermined emotion types, and information on a concentration value, on a predetermined concentration scale, of the user's concentration on the companion animal avatar detected from the face image;
each of the plurality of predetermined emotion types has a predetermined positive or negative emotion value in relation to the relationship with the companion animal avatar; and
the emotion information further includes the emotion type information and the emotion value according to the one or more detected emotions, and the detected concentration value.

The method of claim 3, wherein:
the personality information of the companion animal avatar is modified by repeating the transmitting of the emotion information and the receiving of the second motion information of the companion animal avatar generated based on the emotion information; and
new second motion information of the companion animal avatar is generated and received based on the personality information.

The method of claim 3, further comprising receiving, from the server, arbitrary simulation information about a first situation in which the companion animal avatar is placed, together with the first motion information, wherein:
the emotion information includes the emotion type information, the emotion value, and the concentration value of the user with respect to a motion of the companion animal avatar based on the first motion information in the first situation; and
when simulation information of a second situation similar to the first situation is received together with the second motion information, the companion animal avatar acts according to the second motion information generated based on the emotion information.

The method of claim 5, wherein the input device includes a microphone, the method further comprising transmitting to the server a voice message from the user to the companion animal avatar obtained through the microphone, wherein:
the more the voice message includes the name set for the companion animal avatar, the higher the positive emotion value and the concentration value included in the emotion information transmitted to the server;
in the personality information of the companion animal avatar, the rapport with the user is modified based on the emotion information; and
new second motion information of the companion animal avatar is generated and received based on the personality information.

The method of claim 5, wherein:
in the personality information of the companion animal avatar, the rapport with the user is modified based on the emotion information;
the degree to which the personality information of the companion animal avatar is modified in response to the emotion type information, the emotion value, and the concentration value of the user is determined according to the type of the first situation in which the companion animal avatar is placed;
when the type of the first situation is one of a plurality of predetermined situations and the emotion type predetermined as corresponding to that situation is included in the emotion type information of the user, the degree to which the personality information is modified is high;
the plurality of predetermined emotion types include one or more of joy, surprise, sadness, anger, fear, displeasure, and calmness;
the plurality of predetermined situations include training, walking, and play;
when the type of the first situation is training and the emotion type information of the user is joy or surprise, the degree to which the personality information is modified is high; and
when the type of the first situation is walking or play and the emotion type information of the user is joy, the degree to which the personality information is modified is high.

The method of claim 1, wherein:
the emotion information of the user includes information on one or more emotion types detected at or above a threshold ratio among a plurality of predetermined emotion types;
the plurality of predetermined emotion types include one or more of joy, surprise, sadness, anger, fear, displeasure, and calmness;
when the emotion type information of the user is sadness, the second motion information includes a comforting motion; and
when the emotion type information of the user is displeasure, the second motion information includes a motion of acting cute or a motion of the companion animal avatar that the user has previously liked.

A terminal of a user in a communication system, the terminal comprising a transceiver, a memory, a processor, an input device, an output device, and a camera,
wherein the processor is configured to perform the method according to any one of claims 1 to 8.

A computer program recorded on a computer readable storage medium, configured to perform the method according to any one of claims 1 to 8.