KR102565321B1

KR102565321B1 - Character emotion recognition and tagging apparatus, character emotion recognition and tagging method, and character emotion recognition and tagging system including the character emotion recognition and tagging apparatus

Info

Publication number: KR102565321B1
Application number: KR1020210010843A
Authority: KR
Inventors: 천애리; 신정민
Original assignee: 주식회사 플랫팜
Priority date: 2021-01-26
Filing date: 2021-01-26
Publication date: 2023-08-09
Also published as: KR20220107772A

Abstract

In one aspect of the present disclosure, a character emotion recognition and tagging apparatus includes: a transceiver; at least one processor; and at least one memory operatively connected to the at least one processor and storing at least one instruction that causes the at least one processor to perform operations. The operations include: receiving, through the transceiver, input image data including a character object from a user terminal; classifying the character object into one emotion category among a plurality of emotion categories based on a feature of the input image data and a pre-trained classification model; tagging the input image data with the one emotion category to generate tagged input image data; and transmitting the tagged input image data to the user terminal through the transceiver. The pre-trained classification model is trained using, as the feature, an integrated action unit (AU) of a character object included in training image data stored in the at least one memory.

Description

Character emotion recognition and tagging apparatus, character emotion recognition and tagging method, and character emotion recognition and tagging system including the character emotion recognition and tagging apparatus

The present disclosure relates to a character emotion recognition and tagging apparatus, a character emotion recognition and tagging method, and a character emotion recognition and tagging system including the character emotion recognition and tagging apparatus, in the field of artificial intelligence visual (context) processing.

Content analysis, search, and provision are advancing rapidly with the development of deep-learning-based artificial intelligence technology and are expected to be applied across a variety of platforms. For example, to reduce uncertainty about the quality of, and taste for, experience-goods content, user-customized content can be provided on the basis of artificial intelligence technology during the decision-making process for digital content consumption. User-customized content is emerging as an important keyword in artificial intelligence technology, particularly in content-related technology. Artificial intelligence technology is thus an essential element in the content market, where economies of scale matter.

In addition, non-face-to-face communication through mobile content distribution and services has recently been increasing. As the mobile content industry grows, uploaded content needs to be classified and analyzed at a pace that matches the rapidly increasing speed and volume of uploads.

However, current technology focuses on text or speech content, and technical development for other kinds of multimedia is lacking. In particular, for creative content such as characters or emoticons, there is no user-customized content service that can automatically recognize, classify, and tag the emotions of the characters in a creative work during the process from upload to distribution.

Republic of Korea Patent Publication No. 10-2018-0111467
Republic of Korea Registered Patent No. 10-2110393

Various examples of the present disclosure are intended to provide a character emotion recognition and tagging apparatus, a character emotion recognition and tagging method, and a character emotion recognition and tagging system including the character emotion recognition and tagging apparatus, each capable of automatically recognizing and tagging the emotion of a character included in content during the upload and distribution of that content.

The technical problems to be addressed by various examples of the present disclosure are not limited to those mentioned above, and other technical problems not mentioned may be considered by those of ordinary skill in the art from the various examples of the present disclosure described below.

The integrated AU may include a human AU and a character AU.

The human AU may include at least one of a plurality of eyebrow AUs, a plurality of eye AUs, a plurality of nose AUs, a plurality of mouth AUs, and a plurality of cheek AUs, which differ from one another in at least one of the distances and ratios between a plurality of feature points included in each of the eyebrows, eyes, nose, mouth, and cheeks of the character object included in the training image data.

The character AU may include at least one of a background object AU included in the training image data and an emotion object AU included in the training image data.

The emotion object AU may correspond to an object, among the objects in the training image data other than the character object, that includes an edge at least partially overlapping an edge of the character object, and the background object AU may correspond to the remaining objects in the training image data other than the character object and the emotion object AU.
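The edge-overlap rule in the preceding paragraph can be sketched as follows. This is a minimal illustration rather than the disclosed implementation: it assumes each object's edge is already available as a set of pixel coordinates, and `SceneObject` and `split_character_aus` are hypothetical names introduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SceneObject:
    name: str
    edge: frozenset  # (x, y) pixels on the object's outline (assumed representation)


def split_character_aus(character, others):
    """Partition non-character objects into (emotion_objects, background_objects)."""
    emotion_objects, background_objects = [], []
    for obj in others:
        # An object whose edge shares at least one pixel with the character's
        # edge is treated as an emotion object; the rest are background objects.
        if obj.edge & character.edge:
            emotion_objects.append(obj)
        else:
            background_objects.append(obj)
    return emotion_objects, background_objects


character = SceneObject("character", frozenset({(0, 0), (0, 1), (1, 1)}))
sweat_drop = SceneObject("sweat_drop", frozenset({(1, 1), (2, 2)}))
sun = SceneObject("sun", frozenset({(9, 9), (9, 8)}))
emotion_objects, background_objects = split_character_aus(character, [sweat_drop, sun])
```

Here the hypothetical sweat drop touches the character's outline and is grouped as an emotion object, while the sun does not and falls into the background group.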

The operations may further include: receiving classification option data from the user terminal through the transceiver, the classification option data including selection information for selecting at least one of the background object AU and the emotion object AU; selecting, from among the background object AU and the emotion object AU, the AU corresponding to the selection information; and classifying the character object into the one emotion category based on the human AU, the AU corresponding to the selection information, and the pre-trained classification model.
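The option-driven selection described above might look like the sketch below. It assumes the selection information arrives as a set of the strings `"background"` and `"emotion"` and that AU features are plain lists of numbers; both the encoding and the function name `select_aus` are assumptions, not part of the disclosure.

```python
CHARACTER_AU_KINDS = ("background", "emotion")


def select_aus(human_au, character_aus, selection):
    """Return the AU groups to feed to the classifier for one request."""
    unknown = set(selection) - set(CHARACTER_AU_KINDS)
    if unknown:
        raise ValueError(f"unknown AU kinds: {sorted(unknown)}")
    # The human AU is always used; character AUs are included only if selected.
    selected = {"human": human_au}
    for kind in CHARACTER_AU_KINDS:
        if kind in selection:
            selected[kind] = character_aus[kind]
    return selected


features = select_aus(
    human_au=[0.4, 1.2],
    character_aus={"background": [0.0], "emotion": [1.0]},
    selection={"emotion"},
)
```

A model with a fixed input size would likely need the unselected groups masked rather than omitted; that choice is left open here.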

The operations may further include: receiving evaluation data for the tagged input image data through the transceiver; and, if the evaluation data is less than a preset threshold value, classifying the character object into the one emotion category based on the pre-trained classification model and at least one of the human AU, the background object AU, and the emotion object AU.
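The threshold check above can be sketched as a small guard around re-classification. The numeric rating scale, the function name `retag_if_poorly_rated`, and the `reclassify` callback are all hypothetical; the disclosure only states that re-classification happens when the evaluation data is below a preset threshold.

```python
def retag_if_poorly_rated(current_tag, rating, threshold, reclassify):
    """Keep the current tag unless the rating is strictly below the threshold."""
    if rating < threshold:
        # Re-run classification, e.g. with a different combination of AUs.
        return reclassify()
    return current_tag


kept = retag_if_poorly_rated("joy", rating=3, threshold=3, reclassify=lambda: "excitement")
changed = retag_if_poorly_rated("joy", rating=2, threshold=3, reclassify=lambda: "excitement")
```

Because the comparison is "less than", a rating equal to the threshold leaves the tag unchanged.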

Each integrated AU may be labeled with at least one of the plurality of emotion categories.
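One way to hold such labels is a mapping from an AU identifier to a set of emotion categories, checked for consistency before training. The AU identifiers, the category subset, and `validate_au_labels` below are illustrative assumptions only.

```python
def validate_au_labels(au_labels, categories):
    """Check that every AU carries at least one label drawn from `categories`."""
    allowed = set(categories)
    for au_id, labels in au_labels.items():
        if not labels:
            raise ValueError(f"{au_id} has no emotion label")
        extra = set(labels) - allowed
        if extra:
            raise ValueError(f"{au_id} has unknown labels: {sorted(extra)}")
    return True


categories = ("joy", "anger", "excitement")
au_labels = {
    "eyebrow_AU_1": {"anger"},              # hypothetical AU identifiers
    "mouth_AU_3": {"joy", "excitement"},    # one AU may carry several labels
}
labels_ok = validate_au_labels(au_labels, categories)
```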

In another aspect of the present disclosure, a character emotion recognition and tagging method performed by a character emotion recognition and tagging apparatus includes: receiving input image data including a character object from a user terminal; classifying the character object into one emotion category among a plurality of emotion categories based on a feature of the input image data and a pre-trained classification model; tagging the input image data with the one emotion category to generate tagged input image data; and transmitting the tagged input image data to the user terminal. The pre-trained classification model may be trained using, as the feature, an integrated action unit (AU) of a character object included in training image data stored in the at least one memory.
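The classify-and-tag core of the method above can be sketched as a single function, with receiving and transmitting left to the caller. `TaggedImage`, `recognize_and_tag`, and the two callbacks are hypothetical names; the feature extractor and classifier are stand-ins, not the disclosed model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaggedImage:
    image: bytes
    emotion_tag: str


def recognize_and_tag(image, extract_features, classify):
    """Extract features, classify into one emotion category, and attach the tag."""
    features = extract_features(image)   # e.g. integrated AU features
    emotion = classify(features)         # exactly one emotion category
    return TaggedImage(image=image, emotion_tag=emotion)


# Stub callbacks so the sketch runs without a trained model.
tagged = recognize_and_tag(
    b"\x89PNG...",
    extract_features=lambda img: [len(img)],
    classify=lambda feats: "joy",
)
```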

The integrated AU may include a human AU and a character AU. The human AU may include at least one of a plurality of eyebrow AUs, a plurality of eye AUs, a plurality of nose AUs, a plurality of mouth AUs, and a plurality of cheek AUs, which differ from one another in at least one of the distances and ratios between a plurality of feature points included in each of the eyebrows, eyes, nose, mouth, and cheeks of the character object included in the training image data. The character AU may include at least one of a background object AU included in the training image data and an emotion object AU included in the training image data.

The method may further include: receiving classification option data from the user terminal, the classification option data including selection information for selecting at least one of the background object AU and the emotion object AU; selecting, from among the background object AU and the emotion object AU, the AU corresponding to the selection information; and classifying the character object into the one emotion category based on the human AU, the AU corresponding to the selection information, and the pre-trained classification model.

The method may further include: receiving evaluation data for the tagged input image data; and, if the evaluation data is less than a preset threshold value, classifying the character object into the one emotion category based on the pre-trained classification model and at least one of the human AU, the background object AU, and the emotion object AU.

In yet another aspect of the present disclosure, a character emotion recognition and tagging system includes: a user terminal; and a character emotion recognition and tagging apparatus that classifies the emotion of a character object included in input image data received from the user terminal. The character emotion recognition and tagging apparatus includes: a transceiver; at least one processor; and at least one memory operatively connected to the at least one processor and storing at least one instruction that causes the at least one processor to perform operations. The operations include: receiving the input image data from the user terminal through the transceiver; classifying the character object into one emotion category among a plurality of emotion categories based on a feature of the input image data and a pre-trained classification model; tagging the input image data with the one emotion category to generate tagged input image data; and transmitting the tagged input image data to the user terminal through the transceiver. The pre-trained classification model is trained using, as the feature, an integrated action unit (AU) of a character object included in training image data stored in the at least one memory.

The various examples of the present disclosure described above are only some of the preferred examples of the present disclosure, and various examples reflecting their technical features can be derived and understood by those of ordinary skill in the art on the basis of the detailed description below.

Various examples of the present disclosure provide the following effects.

According to various examples of the present disclosure, a character emotion recognition and tagging apparatus, a character emotion recognition and tagging method, and a character emotion recognition and tagging system including the character emotion recognition and tagging apparatus can be provided, each capable of automatically recognizing and tagging the emotion of a character included in content during the upload and distribution of that content.

In addition, user-customized information can be identified from user (creator) information, and character images can be analyzed to recommend an optimal emotion recognition tag.

In addition, tag information can be recommended automatically for content uploaded to a character image platform, and the accuracy of emotion category classification can be increased.

In addition, the efficiency of content management can be improved.

The effects obtainable from various examples of the present disclosure are not limited to those mentioned above, and other effects not mentioned can be clearly derived and understood by those of ordinary skill in the art on the basis of the detailed description below.

The accompanying drawings are provided to aid understanding of various examples of the present disclosure and, together with the detailed description, present those examples. However, the technical features of the various examples are not limited to any specific drawing, and features disclosed in the drawings may be combined with one another to form new embodiments. Reference numerals in each drawing denote structural elements.
FIG. 1 is a block diagram of a character emotion recognition and tagging apparatus according to an example of the present disclosure.
FIGS. 2 to 5 illustrate action units (AUs) defined according to various examples of the present disclosure.
FIG. 6 illustrates emotion category labeling.
FIG. 7 is a schematic diagram of a character emotion recognition and tagging system according to an example of the present disclosure.
FIG. 8 is a flowchart of a character emotion recognition and tagging method according to an example of the present disclosure.

Hereinafter, implementations according to the present invention are described in detail with reference to the accompanying drawings. The detailed description set forth below, together with the accompanying drawings, is intended to describe exemplary implementations of the present invention and is not intended to represent the only implementations in which the present invention may be practiced. The following detailed description includes specific details in order to provide a thorough understanding of the present invention. However, those skilled in the art will recognize that the present disclosure may be practiced without these specific details.

In some cases, to avoid obscuring the concept of the present disclosure, well-known structures and devices may be omitted or shown in block diagram form centered on the core functions of each structure and device. In addition, the same reference numerals are used for the same elements throughout the present disclosure.

Because various examples according to the concept of the present invention may be modified in various ways and may take various forms, the examples are illustrated in the drawings and described in detail in the present disclosure. However, this is not intended to limit the examples to the specific disclosed forms, and the examples include modifications, equivalents, and substitutes that fall within the spirit and technical scope of the present invention.

Terms such as "first" and "second" may be used to describe various components, but the components should not be limited by these terms. The terms are used only to distinguish one component from another; for example, without departing from the scope of rights according to the concept of the present invention, a first component may be named a second component, and similarly a second component may be named a first component.

When a component is referred to as being "connected" or "coupled" to another component, it may be directly connected or coupled to that other component, but it should be understood that intervening components may also be present. By contrast, when a component is referred to as being "directly connected" or "directly coupled" to another component, it should be understood that no intervening component is present. Expressions describing the relationship between components, such as "between" and "directly between", or "adjacent to" and "directly adjacent to", should be interpreted in the same way.

In various examples of the present disclosure, "/" and "," should be interpreted as indicating "and/or". For example, "A/B" may mean "A and/or B". Likewise, "A, B" may mean "A and/or B". Furthermore, "A/B/C" may mean "at least one of A, B, and/or C", and "A, B, C" may mean "at least one of A, B, and/or C".

In various examples of the present disclosure, "or" should be interpreted as indicating "and/or". For example, "A or B" may include "only A", "only B", and/or "both A and B". In other words, "or" should be interpreted as indicating "additionally or alternatively".

The terms used in the present disclosure are used only to describe specific examples and are not intended to limit the present invention. Singular expressions include plural expressions unless the context clearly indicates otherwise. In the present disclosure, terms such as "include" or "have" are intended to specify the presence of the stated features, numbers, steps, operations, components, parts, or combinations thereof, and should be understood as not precluding the presence or addition of one or more other features, numbers, steps, operations, components, parts, or combinations thereof.

Unless defined otherwise, all terms used herein, including technical and scientific terms, have the same meaning as commonly understood by one of ordinary skill in the art to which the present invention belongs. Terms such as those defined in commonly used dictionaries should be interpreted as having a meaning consistent with their meaning in the context of the related art and, unless explicitly defined in the present disclosure, are not to be interpreted in an idealized or excessively formal sense. Hereinafter, various examples of the present disclosure are described in detail with reference to the accompanying drawings.

FIG. 1 is a block diagram of a character emotion recognition and tagging apparatus according to an example of the present disclosure.

Referring to FIG. 1, a character emotion recognition and tagging apparatus 100 according to an example of the present disclosure includes a transceiver 110, at least one processor 120, and at least one memory 130.

The transceiver 110 may be connected to the processor 120 and may transmit and/or receive wired or wireless signals. For example, the transceiver 110 may be connected to the user terminal 200 through a wireless communication network. Here, the wireless communication network may include a mobile communication network, a wireless LAN, a short-range wireless communication network, and the like. For example, the wireless communication network may include cellular communication using at least one of LTE, LTE-A (LTE-Advanced), CDMA (code division multiple access), WCDMA (wideband CDMA), UMTS (universal mobile telecommunications system), WiBro (Wireless Broadband), and GSM (Global System for Mobile Communications). As another example, the wireless communication network may include at least one of WiFi (wireless fidelity), Bluetooth, Bluetooth Low Energy (BLE), Zigbee, NFC (near field communication), and radio frequency (RF).

The transceiver 110 may include a transmitter and a receiver. The term "transceiver 110" may be used interchangeably with "RF (radio frequency) unit". The transceiver 110 may transmit and receive various signals to and from the user terminal 200 under the control of the processor 120.

The processor 120 controls the memory 130 and/or the transceiver 110 and may be configured to implement the descriptions, functions, procedures, proposals, methods, and/or operational flowcharts of the present disclosure. For example, the processor 120 may receive a wireless signal through the transceiver 110 and store information included in the wireless signal in the memory 130. The processor 120 may also process information stored in the memory 130 to generate a wireless signal and then transmit the generated wireless signal through the transceiver 110.

The memory 130 may be connected to the processor 120 and may store various information related to the operation of the processor 120. For example, the memory 130 may store software code including instructions for performing some or all of the processes controlled by the processor 120, or for performing the descriptions, functions, procedures, proposals, methods, and/or operational flowcharts of the present disclosure.

Various operation examples of the character emotion recognition and tagging apparatus 100 are described below. These operation examples may be included in the operations of the at least one processor 120 described above.

The processor 120 receives input image data including a character object from the user terminal 200 through the transceiver 110. In the present disclosure, a character object may be an object corresponding to a character, emoticon, or the like created by a creator, as opposed to a live-action image. The input image data includes the character object and may additionally include an emotion object for indicating the emotional state of the character and/or a background object corresponding to a background image.

The processor 120 may classify the character object included in the input image data into one emotion category among a plurality of emotion categories, based on a feature of the input image data received from the user terminal 200 and a pre-trained classification model. In other words, the processor 120 may recognize and classify the emotion of the character object by using the feature of the input image data as input to the pre-trained classification model.

The plurality of emotion categories used to recognize and classify the emotion of a character object may be set or defined in various ways. For example, the emotion categories may be set or defined as joy, eagerness, excitement, surprise, anger, fear, sadness, boredom, tiredness, disappointment, shyness, sulkiness, comfort, worry, caring, slyness, hunger, pain, embarrassment, and playfulness.
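As a rough illustration of "one emotion category among a plurality", the sketch below picks the category with the highest score from a model's output. The English category names are translations of the example list, and the assumption that the model emits one score per category, as well as the name `pick_category`, are hypothetical.

```python
EMOTION_CATEGORIES = (
    "joy", "eagerness", "excitement", "surprise", "anger",
    "fear", "sadness", "boredom", "tiredness", "disappointment",
    "shyness", "sulkiness", "comfort", "worry", "caring",
    "slyness", "hunger", "pain", "embarrassment", "playfulness",
)


def pick_category(scores):
    """Return the single category whose score is highest."""
    if len(scores) != len(EMOTION_CATEGORIES):
        raise ValueError("expected one score per emotion category")
    best_index = max(range(len(scores)), key=scores.__getitem__)
    return EMOTION_CATEGORIES[best_index]


scores = [0.0] * len(EMOTION_CATEGORIES)
scores[3] = 0.9  # highest score at the "surprise" position
predicted = pick_category(scores)
```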

The determination of which emotion category, among categories such as those in the above example, the character object is classified into may be performed using various AI (artificial intelligence) algorithms. In other words, the learning algorithm used to train the pre-trained classification model described above may include various machine learning algorithms, such as supervised learning, unsupervised learning, reinforcement learning, and transfer learning.

The feature used to train the classification model may be the integrated action unit (AU) of a character object. That is, the processor 120 may train a classification model using the integrated AU of the character object as the feature and produce the pre-trained classification model as the training result. In the present disclosure, an AU may refer to a single element formed by at least one of an edge extracted from a character object and a feature point included in the edge.

Hereinafter, the integrated AU defined in the present disclosure is described in more detail.

FIGS. 2 to 5 illustrate action units (AUs) defined according to various examples of the present disclosure.

The integrated AU is extracted from training image data or input image data and may include a human AU and a character AU. Here, the training image data is image data used to train the classification model, and may be pre-stored in the memory 130 or received from a user device other than the user terminal 200.

The human AU may correspond to elements that express a facial expression, such as facial features, while the character AU corresponds to elements that are unrelated to the facial expression but directly or indirectly express the character's emotion. For convenience, these are referred to as the human AU and the character AU, respectively.

Referring to FIG. 2, the human AU may include at least one of a plurality of eyebrow AUs, a plurality of eye AUs, a plurality of nose AUs, a plurality of mouth AUs, and a plurality of cheek AUs, which differ from one another in at least one of the distances and ratios between a plurality of feature points included in each of the eyebrows, eyes, nose, mouth, and cheeks of the character object contained in the training image data or the input image data.
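As an illustration of how distances and ratios between feature points could distinguish one AU from another, the following minimal sketch computes a scale-invariant descriptor for a set of points. The function names, the normalization by the longest distance, and the mouth coordinates are assumptions made for this example; the disclosure does not specify how the distances or ratios are computed.

```python
import math
from itertools import combinations


def pairwise_distances(points):
    """Euclidean distance between every pair of feature points."""
    return [math.dist(p, q) for p, q in combinations(points, 2)]


def distance_ratios(points):
    """Each pairwise distance divided by the longest one (scale-invariant)."""
    distances = pairwise_distances(points)
    longest = max(distances)
    if longest == 0:
        raise ValueError("feature points must not all coincide")
    return [d / longest for d in distances]


# Hypothetical mouth feature points: left corner, right corner, upper lip, lower lip.
closed_mouth = [(0.0, 0.0), (4.0, 0.0), (2.0, 0.5), (2.0, -0.5)]
open_mouth = [(0.0, 0.0), (4.0, 0.0), (2.0, 1.5), (2.0, -1.5)]

# The last ratio is (lip gap / mouth width), which separates the two shapes.
closed_gap = distance_ratios(closed_mouth)[-1]
open_gap = distance_ratios(open_mouth)[-1]
```

In this toy data the lip-gap ratio is larger for the open mouth than for the closed one, so two mouth AUs with the same feature points can still be told apart by their ratios.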

The eyebrow AU, eye AU, nose AU, mouth AU, and cheek AU may each include at least one of a plurality of feature points and edges included in the eyebrows, eyes, nose, mouth, and cheeks of the character object, respectively. The eyebrow AU, eye AU, nose AU, mouth AU, and cheek AU may be set or defined in various ways according to the emotion.

Referring to FIG. 3, for example, a character object corresponding to the sadness emotion category may include a frowning eyebrow AU 311, a closed eye AU 321, an enlarged nose AU 331, and/or an M-shaped mouth AU 341.

For example, a character object corresponding to the anger emotion category may include an angry eye AU 322, an enlarged nose AU 332, and/or an open mouth AU 342.

For example, a character object corresponding to the joy emotion category may include a smiling eye AU 323 and/or a U-shaped mouth AU 343.

Referring to FIGS. 4 and 5, the character AU may include at least one of a background object AU and an emotion object AU included in the training image data used to train the classification model or in the input image data. The background object AU and the emotion object AU may each include at least one of a plurality of feature points and edges included in the above-described background object and emotion object, respectively. The background object AU and the emotion object AU may be set or defined in various ways according to the emotion.

More specifically, the emotion object AU may correspond to an object that, among the objects other than the character object in the training image data or the input image data, includes an edge at least partially overlapping an edge of the character object. The background object AU may correspond to the remaining objects in the training image data or the input image data other than the character object and the emotion object AU.
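The edge-overlap rule above can be sketched as follows. This is only an illustration under stated assumptions: edges are represented as sets of pixel coordinates, overlap is tested by set intersection, and the object names and coordinates are hypothetical. The disclosure does not prescribe an edge representation or an overlap test.

```python
def split_objects(character_edge, other_objects):
    """Split non-character objects into emotion objects and background objects.

    character_edge: set of (x, y) edge pixels of the character object.
    other_objects: dict mapping an object name to its set of (x, y) edge pixels.
    An object whose edge shares at least one pixel with the character edge is
    treated as an emotion object; every other object is a background object.
    """
    emotion_objects, background_objects = [], []
    for name, edge in other_objects.items():
        if edge & character_edge:
            emotion_objects.append(name)
        else:
            background_objects.append(name)
    return emotion_objects, background_objects


# Hypothetical edges: the tear touches the character outline, the flower does not.
character_edge = {(x, 0) for x in range(10)} | {(x, 9) for x in range(10)}
other_objects = {
    "tear": {(3, 0), (3, 1), (3, 2)},
    "flower_pattern": {(20, 20), (21, 20)},
}
emotion_objects, background_objects = split_objects(character_edge, other_objects)
```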

For example, the emotion object AU may include a tear AU 411, an anger mark AU 412, a flame AU 413, and a sweat drop AU 414.

For example, the background object AU may include a flower pattern AU 421 and a text AU 422. Here, when the image corresponds to the joy emotion category, for example, the text AU 422 may include text expressing joy, such as 'I'm so happy', as illustrated.

The integrated AUs described above are merely examples; various other elements that can effectively express the corresponding emotion for each emotion category may also be set or defined as integrated AUs.

FIG. 6 illustrates emotion category labeling.

Referring to FIG. 6, each of the integrated AUs described above may be labeled with at least one of the plurality of emotion categories. In this case, the training image data may be, for example, (emotion category, integrated AU) pairs. That is, when training the classification model, supervised learning may be used in which integrated AUs labeled with emotion categories serve as the training image data.
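To make the (emotion category, integrated AU) pairing concrete, the sketch below trains a toy voting classifier on labeled pairs. The voting scheme, the function names, and the AU names are assumptions chosen for brevity; the disclosure does not name a specific model, and a real system would more likely use a learned model over numeric features.

```python
from collections import Counter, defaultdict


def train(pairs):
    """Count, for each AU name, how often it appears under each emotion category.

    pairs: iterable of (emotion_category, set_of_AU_names).
    """
    votes = defaultdict(Counter)
    for category, aus in pairs:
        for au in aus:
            votes[au][category] += 1
    return votes


def classify(model, aus):
    """Return the category with the most votes, or None if no AU is known."""
    tally = Counter()
    for au in aus:
        tally.update(model.get(au, {}))
    if not tally:
        return None
    return tally.most_common(1)[0][0]


# Hypothetical labeled pairs loosely following the examples of FIGS. 3 to 5.
training_pairs = [
    ("sadness", {"frowning_eyebrow", "closed_eye", "enlarged_nose", "tear"}),
    ("anger", {"angry_eye", "enlarged_nose", "open_mouth", "anger_mark"}),
    ("joy", {"smiling_eye", "u_shaped_mouth", "flower_pattern"}),
]
model = train(training_pairs)
prediction = classify(model, {"enlarged_nose", "tear"})
```

Here the enlarged nose is ambiguous between sadness and anger, and the tear (a character AU) tips the vote toward sadness.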

The processor 120 may generate tagged input image data by tagging the input image data with the one classified emotion category. Here, the tagged input image data may include both the input image data and information on the one classified emotion category.

The processor 120 may transmit the generated tagged input image data to the user terminal 200 through the transceiver 110.

According to the character emotion recognition and tagging apparatus 100 described above, the emotion of an object contained not in live-action footage but in a creative work, such as a character or an emoticon, can be effectively recognized and classified. Specifically, the character emotion recognition and tagging apparatus 100 can recognize and classify the emotion of a character object more precisely by taking character elements into account in the features used by AI algorithms such as machine learning.

Alternatively, according to another example of the present disclosure, the character emotion recognition and tagging apparatus 100 may select the feature to be used for emotion classification when classifying the emotion of the input image data. Here, the selected feature may be any one of the character AUs.

Specifically, the processor 120 may receive classification option data from the user terminal 200 through the transceiver 110. Here, the classification option data may be selection information for selecting at least one of the background object AU and the emotion object AU.

The processor 120 may select, from among the background object AU and the emotion object AU, the AU corresponding to the received selection information.

The processor 120 may classify the character object into one emotion category based on the human AU, the AU corresponding to the received selection information, and the pre-learned classification model.
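The feature selection driven by classification option data might look like the sketch below. The dictionary layout, the group keys `"human"`, `"background"`, and `"emotion"`, and the function name are all hypothetical; the disclosure only states that the selection information picks at least one of the background object AU and the emotion object AU, and that the human AU is used alongside it.

```python
def select_features(extracted_aus, selection):
    """Combine the human AUs with the character AU groups named in `selection`.

    extracted_aus: dict with keys "human", "background", "emotion", each a list
        of AU names extracted from the input image (hypothetical layout).
    selection: iterable containing "background" and/or "emotion".
    """
    allowed = {"background", "emotion"}
    unknown = set(selection) - allowed
    if unknown:
        raise ValueError(f"unsupported selection: {sorted(unknown)}")

    features = list(extracted_aus["human"])  # the human AU is always used
    for group in ("background", "emotion"):
        if group in selection:
            features.extend(extracted_aus.get(group, []))
    return features


extracted = {
    "human": ["closed_eye", "m_shaped_mouth"],
    "background": ["text"],
    "emotion": ["tear"],
}
only_emotion = select_features(extracted, {"emotion"})
both_groups = select_features(extracted, {"background", "emotion"})
```

The resulting feature list would then be passed to the pre-learned classification model in place of the full integrated AU.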

Accordingly, the character emotion recognition and tagging apparatus 100 according to another example of the present disclosure lets a creator select the features used for emotion recognition and classification, and performs emotion recognition and classification based on the selected features, which may increase satisfaction with the machine learning result.

In addition, for market research on content, the processor 120 may search for similar images corresponding to the classified and tagged emotion category and transmit the retrieved similar images to the user terminal 200 through the transceiver 110.

The processor 120 may also determine whether content copyright is infringed by checking the similarity between the retrieved similar images and the uploaded image. If the similarity check indicates that content copyright is infringed, the processor 120 may transmit a separate warning message to the user terminal 200 through the transceiver 110.

Alternatively, according to yet another example of the present disclosure, the character emotion recognition and tagging apparatus 100 may perform reclassification by adding features according to the evaluation of the classified emotion.

Specifically, the processor 120 may receive evaluation data for the tagged input image data through the transceiver 110. Here, the evaluation data may be data that quantifies the user's satisfaction with the emotion classified in the first pass.

If the received evaluation data is less than a preset threshold value, the processor 120 may classify the character object into one emotion category based on at least one of the human AU, the background object AU, and the emotion object AU, together with the pre-learned classification model. For example, when the maximum evaluation score is 10 points, the preset threshold value is 5 points, and the score in the received evaluation data is 4 points, the processor 120 may perform emotion classification again using at least one of the human AU, the background object AU, and the emotion object AU as the feature.
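The threshold check can be sketched as below, using the numbers from the example (threshold 5, received score 4). The function names, the callback-style `get_score`, and the rule-based `toy_classify` stand-in are assumptions for illustration only; the first pass here uses only human AUs and the second pass adds character AUs, which is one of the combinations the text allows.

```python
def classify_with_feedback(classify, human_aus, character_aus, get_score, threshold=5):
    """Classify with human AUs first; reclassify with character AUs if rated low.

    classify: callable mapping a list of AU names to an emotion category.
    get_score: callable returning the user's evaluation score for a category.
    """
    first = classify(human_aus)
    if get_score(first) >= threshold:
        return first
    # Score below the threshold: add the character AUs and classify again.
    return classify(human_aus + character_aus)


def toy_classify(aus):
    """Hypothetical stand-in for the pre-learned classification model."""
    return "sadness" if "tear" in aus else "tiredness"


# The user rates the first result 4 out of 10, below the threshold of 5.
result = classify_with_feedback(
    toy_classify, ["closed_eye"], ["tear"], get_score=lambda category: 4
)
```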

The character emotion recognition and tagging apparatus 100 according to this example may be particularly useful when the first-pass emotion classification uses only the human AU as the feature. That is, the character emotion recognition and tagging apparatus 100 first performs emotion classification using the human AU as the feature and, when the user's satisfaction with the classified emotion is low, performs emotion classification again using the character AU as an additional feature, which may increase user satisfaction.

FIG. 7 is a schematic diagram of a character emotion recognition and tagging system according to an example of the present disclosure.

Hereinafter, detailed descriptions of parts that overlap with those described above are omitted.

Referring to FIG. 7, a character emotion recognition and tagging system according to an example of the present disclosure includes the character emotion recognition and tagging apparatus 100 and the user terminal 200.

The character emotion recognition and tagging apparatus 100 classifies the emotion of a character object included in input image data received from the user terminal 200.

The user terminal 200 may exchange data with the character emotion recognition and tagging apparatus 100. For example, the user terminal 200 may transmit input image data for emotion classification. The user terminal 200 may also accept design information such as the character's image style, color, shape, and drawing style, and transmit that design information together with the input image data. For example, the image style may include a still image or a moving image.

Upon receiving design information from the user terminal 200, the character emotion recognition and tagging apparatus 100 may match the received design information to the input image data and store it in the memory 130.

FIG. 8 is a flowchart of a character emotion recognition and tagging method according to an example of the present disclosure.

Referring to FIG. 8, in S110, the character emotion recognition and tagging apparatus 100 may receive input image data including a character object from the user terminal 200.

In S120, the character emotion recognition and tagging apparatus 100 may classify the character object into one emotion category among a plurality of emotion categories based on a feature of the input image data and the pre-learned classification model. Here, the pre-learned classification model may be one trained using, as the feature, the integrated AU of a character object included in the training image data.

The integrated AU may include a human AU and a character AU. The human AU may include at least one of a plurality of eyebrow AUs, a plurality of eye AUs, a plurality of nose AUs, a plurality of mouth AUs, and a plurality of cheek AUs, which differ from one another in at least one of the distances and ratios between a plurality of feature points included in each of the eyebrows, eyes, nose, mouth, and cheeks of the character object included in the training image data.

The character AU may include at least one of a background object AU included in the training image data and an emotion object AU included in the training image data. The emotion object AU may correspond to an object that, among the objects other than the character object in the training image data, includes an edge at least partially overlapping an edge of the character object. The background object AU may correspond to the remaining objects in the training image data other than the character object and the emotion object AU.

In S130, the character emotion recognition and tagging apparatus 100 may generate tagged input image data by tagging the input image data with the one emotion category.

In S140, the character emotion recognition and tagging apparatus 100 may transmit the tagged input image data to the user terminal 200.
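Steps S110 to S140 can be summarized in a short sketch. The `TaggedImage` type, the function name, and the two stub callables are hypothetical; feature extraction and the pre-learned classification model are passed in as parameters because the disclosure does not fix their implementation, and transmission (S140) is left to the caller.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaggedImage:
    """Input image data together with the one classified emotion category."""
    image: bytes
    emotion_category: str


def recognize_and_tag(image, extract_features, classify):
    """Run S120 and S130 on image data already received in S110."""
    features = extract_features(image)   # S120: features of the input image
    category = classify(features)        # S120: pre-learned classification model
    return TaggedImage(image=image, emotion_category=category)  # S130: tagging


# Stub callables standing in for feature extraction and the trained model.
tagged = recognize_and_tag(
    b"fake-image-bytes",
    extract_features=lambda image: ["smiling_eye", "u_shaped_mouth"],
    classify=lambda features: "joy" if "smiling_eye" in features else "sadness",
)
# S140: the caller would send `tagged` back to the user terminal.
```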

In addition, the character emotion recognition and tagging method according to another example of the present disclosure may further include a step of selecting a feature to be used for emotion classification.

For example, the character emotion recognition and tagging method may further include: receiving, by the character emotion recognition and tagging apparatus 100, classification option data from the user terminal 200; selecting, from among the background object AU and the emotion object AU, the AU corresponding to the selection information; and classifying the character object into one emotion category based on the human AU, the AU corresponding to the selection information, and the pre-learned classification model.

Here, the classification option data may include selection information for selecting at least one of the background object AU and the emotion object AU.

In addition, the character emotion recognition and tagging method according to yet another example of the present disclosure may further include a step of selecting a feature to be used for emotion classification.

For example, the character emotion recognition and tagging method may further include: receiving, by the character emotion recognition and tagging apparatus 100, evaluation data for the tagged input image data; and, if the evaluation data is less than a preset threshold value, classifying the character object into one emotion category based on at least one of the human AU, the background object AU, and the emotion object AU, together with the pre-learned classification model.

It is evident that the examples of the proposed schemes described above may also be included as implementation methods of the present disclosure and may therefore be regarded as proposed schemes in their own right. In addition, the proposed schemes described above may be implemented independently, or as a combination (or merger) of some of the proposed schemes.

The examples of the present disclosure disclosed above are provided to enable those skilled in the art to implement and practice the present disclosure. Although the description above refers to examples of the present disclosure, a person skilled in the art may modify and change these examples in various ways. Thus, the present disclosure is not intended to be limited to the examples set forth herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

100: emotion recognition and tagging apparatus
110: transceiver 120: processor
130: memory
200: user terminal

Claims

transceiver;
at least one processor; and
at least one memory operatively connected to the at least one processor to store at least one instruction that causes the at least one processor to perform operations;
The operations include:
Receiving input image data including a character object from a user terminal through the transceiver,
Classifying the character object into one emotion category among a plurality of emotion categories based on a feature of the input image data and a pre-learned classification model;
tagging the input image data with the one emotion category to generate tagged input image data; and
Transmitting the tagged input image data to the user terminal through the transceiver;
The pre-learned classification model is learned by characterizing an integrated action unit (AU) of a character object included in the training image data stored in the at least one memory,
The integrated AU includes a human AU and a character AU,
The human AU includes at least one of a plurality of eyebrow AUs, a plurality of eye AUs, a plurality of nose AUs, a plurality of mouth AUs, and a plurality of cheek AUs, which differ in at least one of the distances and ratios between a plurality of feature points included in each of the eyebrows, eyes, nose, mouth, and cheeks of the character object included in the training image data,
The character AU includes at least one of a background object AU included in the training image data and an emotion object AU included in the training image data,
The operations further include:
classifying the character object into the one emotion category using the human AU among the integrated AU as the feature;
receiving evaluation data for the tagged input image data through the transceiver; and
classifying the character object into the one emotion category based on the human AU, the character AU, and the pre-learned classification model if the evaluation data is less than a preset threshold value,
Character emotion recognition and tagging device.

(deleted)

According to claim 1,
The emotion object AU corresponds to an object including an edge at least partially overlapping an edge of the character object among objects other than the character object in the training image data, and
The background object AU corresponds to objects other than the character object and the emotion object AU in the training image data.
Character emotion recognition and tagging device.

According to claim 1,
The operations further include:
receiving classification option data from the user terminal through the transceiver, wherein the classification option data includes selection information for selecting at least one of the background object AU and the emotion object AU;
selecting an AU corresponding to the selection information from among the background object AU and the emotion object AU; and
classifying the character object into the one emotion category based on the human AU, the AU corresponding to the selection information, and the pre-learned classification model,
Character emotion recognition and tagging device.

(deleted)

According to claim 1,
Each of the integrated AUs is labeled with at least one of the plurality of emotion categories,
Character emotion recognition and tagging device.

A character emotion recognition and tagging method performed by a character emotion recognition and tagging device,
Receiving input image data including a character object from a user terminal;
classifying the character object into one emotion category among a plurality of emotion categories based on a feature of the input image data and a pre-learned classification model;
tagging the input image data with the one emotion category to generate tagged input image data; and
Transmitting the tagged input image data to the user terminal,
The pre-learned classification model is learned by characterizing an integrated action unit (AU) of a character object included in training image data,
The integrated AU includes a human AU and a character AU,
The human AU includes at least one of a plurality of eyebrow AUs, a plurality of eye AUs, a plurality of nose AUs, a plurality of mouth AUs, and a plurality of cheek AUs, which differ in at least one of the distances and ratios between a plurality of feature points included in each of the eyebrows, eyes, nose, mouth, and cheeks of the character object included in the training image data,
The character AU includes at least one of a background object AU included in the training image data and an emotion object AU included in the training image data,
The character emotion recognition and tagging method further includes:
classifying the character object into the one emotion category using the human AU among the integrated AU as the feature;
receiving evaluation data for the tagged input image data; and
classifying the character object into the one emotion category based on the human AU, the character AU, and the pre-learned classification model if the evaluation data is less than a preset threshold value,
Character emotion recognition and tagging method.

(deleted)

According to claim 9,
The emotion object AU corresponds to an object including an edge at least partially overlapping an edge of the character object among objects other than the character object in the training image data, and
The background object AU corresponds to objects other than the character object and the emotion object AU in the training image data.
Character emotion recognition and tagging method.

According to claim 9,
The method further includes:
receiving classification option data from the user terminal, wherein the classification option data includes selection information for selecting at least one of the background object AU and the emotion object AU;
selecting an AU corresponding to the selection information from among the background object AU and the emotion object AU; and
classifying the character object into the one emotion category based on the human AU, the AU corresponding to the selection information, and the pre-learned classification model,
Character emotion recognition and tagging method.

(deleted)

According to claim 9,
Each of the integrated AUs is labeled with at least one of the plurality of emotion categories,
Character emotion recognition and tagging method.

user terminal; and
A character emotion recognition and tagging device for classifying the emotion of a character object included in the input image data received from the user terminal,
The character emotion recognition and tagging device:
transceiver;
at least one processor; and
at least one memory operatively connected to the at least one processor to store at least one instruction that causes the at least one processor to perform operations;
The operations include:
Receiving the input image data from the user terminal through the transceiver;
Classifying the character object into one emotion category among a plurality of emotion categories based on a feature of the input image data and a pre-learned classification model;
tagging the input image data with the one emotion category to generate tagged input image data; and
Transmitting the tagged input image data to the user terminal through the transceiver;
The pre-learned classification model is learned by characterizing an integrated action unit (AU) of a character object included in the training image data stored in the at least one memory,
The integrated AU includes a human AU and a character AU,
The human AU includes at least one of a plurality of eyebrow AUs, a plurality of eye AUs, a plurality of nose AUs, a plurality of mouth AUs, and a plurality of cheek AUs, which differ in at least one of the distances and ratios between a plurality of feature points included in each of the eyebrows, eyes, nose, mouth, and cheeks of the character object included in the training image data,
The character AU includes at least one of a background object AU included in the training image data and an emotion object AU included in the training image data,
The operations further include:
classifying the character object into the one emotion category using the human AU among the integrated AU as the feature;
receiving evaluation data for the tagged input image data through the transceiver; and
classifying the character object into the one emotion category based on the human AU, the character AU, and the pre-learned classification model if the evaluation data is less than a preset threshold value,
Character emotion recognition and tagging system.