KR20200117712A

KR20200117712A - Artificial intelligence smart speaker capable of sharing feeling and emotion between speaker and user

Info

Publication number: KR20200117712A
Application number: KR1020190040231A
Authority: KR
Inventors: 전상원; 오정석; 송나영; 구세인; 박영춘; 이중희; 김윤진
Original assignee: 주식회사 로봇앤모어
Priority date: 2019-04-05
Filing date: 2019-04-05
Publication date: 2020-10-14

Abstract

According to an embodiment of the present invention, an artificial intelligence smart speaker capable of emotionally communicating with a user builds a training image database by using various commercial video/image databases in parallel with a learning-type video/image database tailored to the user. Through the database trained on video and images, it obtains answers from a QA database for emotion recognition and image recognition and uses them for expression. A voice query classifier sorts speech recognized by a voice recognition system into questions and commands; for a command the speaker takes the corresponding action, and for a question it answers through a pre-built QA database (for example, one covering video/images). The speaker derives an optimal solution by interworking with commercial recognition systems.

Description

Artificial intelligence smart speaker capable of communicating with users {ARTIFICIAL INTELLIGENCE SMART SPEAKER CAPABLE OF SHARING FEELING AND EMOTION BETWEEN SPEAKER AND USER}

The present invention relates to an artificial intelligence smart speaker and, more specifically, to an artificial intelligence smart speaker that, unlike conventional AI speaker products, goes beyond implementing functions through simple word recognition and enables emotional communication between the smart speaker and the user.

FIG. 1 shows the Amazon Echo, a representative example of a conventional smart speaker. Alexa is an artificial intelligence platform developed by Amazon and was first used in the Amazon Echo. Users can communicate with Alexa through the Amazon Echo, and Alexa provides many functions such as music playback, alarm setting, weather information, and traffic information. Various smart speakers other than those based on Alexa have also been released.

One conventional patent document related to such smart speakers is Korean Patent Application Publication No. 10-2019-0012708 (published February 11, 2019; title of invention: SMART SPEAKER FOR FOOD ORDER AND FOOD ORDER SERVICE PLATFORM USING THEREOF).

That patent document discloses "a technology relating to a smart speaker for food ordering capable of voice recognition and conversation, and a food order service platform using the same, which can recognize the user's voice and then recommend customized food types, restaurants, recipes, and meal plans through voice pattern analysis, can provide more accurate customized food information by linking with various external databases and SNS services, and in particular enables food ordering and payment based on voice recognition technology."

Another patent document is Korean Patent No. 10-1619274 (registered May 2, 2016; title of invention: SMART SPEAKER AND SOUND SYSTEM USING THE SAME).

That patent document discloses a technology in which "the smart speaker performing a function according to motion comprises: a housing formed as a polyhedron; a main control unit provided in the housing; an output unit that receives the sound signal output from the main control unit and emits sound; and a sensing unit that senses the motion state of the housing and the surrounding environment and transmits a detection signal to the main control unit; and the sound system according to that invention comprises: the smart speaker; an external input/output device that transmits and receives data to and from the smart speaker; a management server that works with the mobile device to save or load data; and signal transmission means for mutual data transmission among the smart speaker, the mobile device, and the management server."

However, as the patent documents above show, conventional AI speaker products merely implement functions through simple word recognition. There has been no disclosure of an artificial intelligence smart speaker that goes beyond such word-recognition-based functions and enables emotional communication between the smart speaker and the user.

1. Korean Patent Application Publication No. 10-2019-0012708 (published February 11, 2019; title of invention: SMART SPEAKER FOR FOOD ORDER AND FOOD ORDER SERVICE PLATFORM USING THEREOF)
2. Korean Patent No. 10-1619274 (registered May 2, 2016; title of invention: SMART SPEAKER AND SOUND SYSTEM USING THE SAME)

The present invention was devised to solve the problems described above. Its object is to provide an artificial intelligence smart speaker that, unlike conventional AI speaker products, goes beyond implementing functions through simple word recognition and enables emotional communication between the smart speaker and the user.

Another object of the present invention is to provide an artificial intelligence smart speaker capable of emotional expression that, unlike conventional AI speaker products, goes beyond implementing functions through simple word recognition and enables emotional communication between the smart speaker and the user, the smart speaker comprising: a high-quality speaker module that plays HD audio sources and automatically adjusts the sound according to the size and layout of the space in use, the decor, the speaker position, and the like; a modular controller including motion control, a sensor interface, LED expression, camera recognition, and wireless communication circuits; and a remote device control module using a WiFi or Bluetooth communication module, wherein the smart speaker includes a head and a body, and the mechanical structure between the head and the body allows the smart speaker to express emotion.

To achieve the above object, an artificial intelligence smart speaker capable of emotionally communicating with a user according to an embodiment of the present invention builds a training image database by using various commercial video/image databases in parallel with a learning-type video/image database tailored to the user; obtains answers from a QA database for emotion recognition and image recognition through the database trained on video and images, and uses them for expression; classifies speech recognized by a voice recognition system into questions and commands in a voice query classifier, then takes the corresponding action for a command and, for a question, answers it through a pre-built QA database such as one for video/images; and derives an optimal solution by interworking with commercial recognition systems.

Meanwhile, an artificial intelligence smart speaker capable of emotional expression according to another embodiment of the present invention comprises: a high-quality speaker module that plays HD audio sources and automatically adjusts the sound according to the size and layout of the space in use, the decor, the speaker position, and the like; a modular controller including motion control, a sensor interface, LED expression, camera recognition, and wireless communication circuits; and a remote device control module using a WiFi or Bluetooth communication module. The smart speaker includes a head and a body, and the mechanical structure between the head and the body allows the smart speaker to express emotion.

According to the present invention:

First, it is possible to provide an artificial intelligence smart speaker that, unlike conventional AI speaker products, goes beyond implementing functions through simple word recognition and enables emotional communication between the smart speaker and the user.

Second, it is possible to provide an artificial intelligence smart speaker capable of emotionally communicating with a user, comprising: a high-quality speaker module that plays HD audio sources and automatically adjusts the sound according to the size and layout of the space in use, the decor, the speaker position, and the like; a modular controller including motion control, a sensor interface, LED expression, camera recognition, and wireless communication circuits; and a remote device control module using a WiFi or Bluetooth communication module, wherein the smart speaker includes a head and a body, and the mechanical structure between the head and the body allows the smart speaker to express emotion.

FIG. 1 shows the Amazon Echo, a representative example of a conventional smart speaker.
FIG. 2 is an example of the modular controller of an emotional artificial intelligence smart speaker capable of communicating with a user according to an embodiment of the present invention.
FIG. 3 is a UA state transition diagram for the emotional artificial intelligence smart speaker according to an embodiment of the present invention.
FIG. 4 is a diagram of voice and image recognition in the emotional artificial intelligence smart speaker according to an embodiment of the present invention.
FIG. 5 is a block diagram of the emotional expression generation engine of the emotional robot in the emotional artificial intelligence smart speaker according to an embodiment of the present invention.
FIG. 6 shows an example of a service scenario for learning and play with the emotional artificial intelligence smart speaker according to an embodiment of the present invention.
FIG. 7 shows an example of an application that can be used with the emotional artificial intelligence smart speaker according to an embodiment of the present invention.

Hereinafter, preferred embodiments of the present invention will be described in detail with reference to the accompanying drawings. Before that, it should be noted that the terms and words used in this specification and the claims should not be construed as limited to their ordinary or dictionary meanings. They should be interpreted with meanings and concepts consistent with the technical idea of the present invention, based on the principle that an inventor may appropriately define the concepts of terms in order to describe the invention in the best way.

Accordingly, the embodiments described in this specification and the configurations shown in the drawings are merely the most preferred embodiment of the present invention and do not represent the whole of its technical idea. It should therefore be understood that various equivalents and modifications capable of replacing them may exist at the time of filing.

With recent new digital technologies, the convergence of technologies and products, and of products and services, is spreading across fields such as automobiles, materials, IT, and biotechnology. In particular, big data, artificial intelligence, the Internet of Things, the cloud, 3D printing, and cyber-physical systems are further accelerating the development of manufacturing.

In this era of new digital technologies, consumer spending habits are changing rapidly, and companies are responding with a variety of products, services, and technologies. A representative example is the emergence of smart speakers such as the Amazon Echo and Google Home. For busy modern people who feel they have less and less free time, these devices accept a range of commands by voice alone, without the user lifting a finger: placing calls, searching the Internet, purchasing goods, scheduling, managing home temperature, and adjusting lighting.

Such a smart speaker is a compactly designed speaker product equipped with artificial intelligence (AI) software (a smart assistant) that can be controlled by voice or simple gestures. Linked with various smart devices, it is used as a hub for the Internet of Things (IoT) in homes and offices. Against this background of market and technology trends, the applicant has invented an emotional artificial intelligence smart speaker that enables emotional communication between the user and the smart speaker.

The emotional artificial intelligence smart speaker according to the present invention is an emotion-responsive smart speaker product that can be used as a terminal for a smart device, and it expresses emotion suited to the situation based on the various information the smart device can access. It selects music according to the user's mood and can dance to the rhythm of the music, and it provides extended services using recent technologies such as the Internet of Things and wearables. It applies a robot's motion functions while minimizing a robot's complex hardware and sensor units, which lowers component cost and makes the most of smart resources such as the smart device's sensors and apps.

The product is also designed as an emotional artificial intelligence smart speaker based on user scenarios. It is given a visually attractive product design intended to foster emotional intimacy, with a modern, objet-type exterior that can best express the color and music that stimulate sight and hearing and thereby affect human emotion. It boldly departs from the form of a humanoid robot with arms and legs. It remedies the external flaw of existing emotional robot products, which express emotion by imitating human facial expressions with eyes and a mouth and thereby cause a sense of mechanical unnaturalness and aversion. Moving away from the clunky, motion-oriented design of existing robots, it is a user-centered, appealing emotional product grounded in basic human emotions.

The emotional artificial intelligence smart speaker according to the present invention has a high-quality speaker module. It plays HD audio sources and can automatically adjust the sound according to the size and layout of the space in use, the decor, the speaker position, and the like. The smart speaker also has a modular controller that includes motion control, a sensor interface, LED expression, camera recognition, wireless communication circuits, and the like. FIG. 2 is an example of the modular controller of the emotional artificial intelligence smart speaker according to the present invention.

Next, the emotional artificial intelligence smart speaker has a remote device control module that uses a WiFi or Bluetooth communication module. FIG. 3 shows a UA state transition diagram for the emotional artificial intelligence smart speaker according to an embodiment of the present invention.

Next, the emotional artificial intelligence smart speaker according to the present invention has an engine that recognizes the user's situation and emotion and generates an emotional expression. FIG. 4 is a diagram of voice and image recognition in the smart speaker according to an embodiment of the present invention, and FIG. 5 is a block diagram of the emotional expression generation engine of the emotional robot in the smart speaker according to an embodiment of the present invention.

As shown in FIGS. 4 and 5, a training image database is built by using various commercial video/image databases in parallel with a learning-type video/image database tailored to the user. Through the database trained on video and images, answers are obtained from a QA database for emotion recognition and image recognition and used for expression. Speech recognized by the voice recognition system is classified into questions and commands by a voice query classifier; for a command, the corresponding action is taken, and for a question, the answer is given through a pre-built QA database such as one for video/images. An optimal solution is derived by interworking with various commercial recognition systems (for example, Google and Naver), and various tests and evaluations are carried out to arrive at that solution.
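The question/command routing described above can be sketched as follows. This is only an illustrative sketch under stated assumptions: the specification does not say how the voice query classifier works, so the keyword rule, the names `classify_query`, `handle_query`, `Response`, and `external_lookup`, and the dictionary-based QA database are all hypothetical stand-ins. Speech recognition itself is assumed to have already produced text.

```python
from dataclasses import dataclass
from typing import Callable, Dict

# Hypothetical rule: the specification does not describe the classifier,
# so a simple question-word check stands in for it here.
QUESTION_WORDS = ("what", "who", "where", "when", "why", "how")


@dataclass
class Response:
    kind: str     # "action" for commands, "answer" for questions
    payload: str  # result of the action, or the answer text


def normalize(text: str) -> str:
    """Lower-case the recognized text and drop trailing punctuation."""
    return text.strip().lower().rstrip("?.! ")


def classify_query(text: str) -> str:
    """Label recognized speech as a 'question' or a 'command'."""
    stripped = text.strip().lower()
    first_word = stripped.split(" ", 1)[0]
    if stripped.endswith("?") or first_word in QUESTION_WORDS:
        return "question"
    return "command"


def handle_query(
    text: str,
    actions: Dict[str, Callable[[], str]],
    qa_db: Dict[str, str],
    external_lookup: Callable[[str], str],
) -> Response:
    """Route a query: commands trigger an action, questions get an answer.

    `qa_db` stands in for the pre-built QA database, and `external_lookup`
    stands in for a commercial recognition system used as a fallback.
    """
    key = normalize(text)
    if classify_query(text) == "command":
        action = actions.get(key)
        if action is None:
            return Response("action", "unknown command")
        return Response("action", action())
    answer = qa_db.get(key)
    if answer is None:
        answer = external_lookup(key)
    return Response("answer", answer)
```

For example, `handle_query("Play music", {"play music": lambda: "playing"}, {}, str)` takes the command path, while a query ending in a question mark is looked up in `qa_db` first and passed to the fallback only when no stored answer exists.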

The present invention builds emotional artificial intelligence smart speaker content that is simple yet interactive. Based on the concept of a companion device that responds emotionally, it includes content that uses smart functions through various apps. It executes emotional functions based on user-centered service scenarios, dances or changes color in response to the user's mood or the genre of the music, and, by running content through a smart app, provides an environment in which any user can add new smart functions and create service content.
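As a rough sketch of how a mood or music genre might be turned into a color and motion response, consider the lookup below. The specification gives no concrete mapping, so every mood, genre, color, and motion name here is a hypothetical placeholder, as are the names `Expression` and `choose_expression`.

```python
from typing import NamedTuple, Optional


class Expression(NamedTuple):
    led_color: str  # color shown by the LEDs
    motion: str     # movement between head and body


# Hypothetical tables: the specification does not list moods, genres,
# colors, or motions, so these entries are illustrative only.
MOOD_EXPRESSIONS = {
    "happy": Expression("yellow", "bounce"),
    "sad": Expression("blue", "slow_sway"),
}
GENRE_EXPRESSIONS = {
    "dance": Expression("magenta", "fast_spin"),
    "classical": Expression("white", "gentle_nod"),
}
DEFAULT_EXPRESSION = Expression("white", "idle")


def choose_expression(
    mood: Optional[str] = None, genre: Optional[str] = None
) -> Expression:
    """Pick an expression, preferring the user's mood over the music genre."""
    if mood in MOOD_EXPRESSIONS:
        return MOOD_EXPRESSIONS[mood]
    if genre in GENRE_EXPRESSIONS:
        return GENRE_EXPRESSIONS[genre]
    return DEFAULT_EXPRESSION
```

Giving mood priority over genre is a design choice made for this sketch; the text only says the device responds to "the user's mood or the genre of the music" without ranking them.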

The invention also includes an artificial intelligence smart speaker mechanism; that is, a mechanism customized to a product design that fosters emotional closeness and to the required product performance. It further includes service scenarios for learning and play with the emotional artificial intelligence smart speaker. FIG. 6 shows an example.

FIG. 7 shows an example of an application that can be used with the emotional artificial intelligence smart speaker capable of communicating with a user according to an embodiment of the present invention. It includes content that can express emotion in conjunction with lighting, motion, and the like.

Although the present invention has been described above with reference to limited embodiments and drawings, it is not limited to them. Those of ordinary skill in the art to which the present invention pertains can of course make various modifications and variations within the technical idea of the present invention and the scope of equivalents of the claims set out below.

Claims

An artificial intelligence smart speaker capable of communicating with a user, which:
builds a training image database by using various commercial video/image databases in parallel with a learning-type video/image database tailored to the user;
obtains answers from a QA database for emotion recognition and image recognition through the database trained on video and images, and uses them for expression;
classifies speech recognized by a voice recognition system into questions and commands in a voice query classifier, then takes the corresponding action for a command and, for a question, answers it through a pre-built QA database such as one for video/images; and
derives an optimal solution by interworking with a commercial recognition system.