KR20210014053A - System and method for registering device for voice assistant service

Info

Publication number: KR20210014053A
Application number: KR1020190125679A
Authority: KR
Inventors: 이호정; 고현목; 오형래; 황인철
Original assignee: 삼성전자주식회사
Priority date: 2019-07-29
Filing date: 2019-10-10
Publication date: 2021-02-08

Abstract

Provided are a system and method for registering a device for a voice assistant service. The method for registering a novel device for a voice assistant service by a server comprises the operations of: comparing functions of a previously registered device and functions of the novel device; identifying functions corresponding to the functions of the previously registered device among the functions of the novel device based on the comparison result; obtaining previously registered utterance data related to at least a part of the identified functions; and generating action data for the novel device based on the identified functions and the obtained previously registered utterance data.

Description

System and method for registering device for voice assistant service

The present disclosure relates to a system and method for registering a new device for a voice assistant service.

With the development of multimedia and network technologies, users can receive a variety of services through devices. In particular, as speech recognition technology has advanced, a user can input a voice (for example, an utterance) to a device and receive a response message to the voice input through a service providing agent.

When a voice assistant service determines the intent contained in a user's voice input, artificial intelligence (AI) technology may be used, and rule-based natural language understanding (NLU) technology may also be used.

However, when a new device is added to a home network environment that includes a plurality of devices, it has been difficult for a voice assistant service to provide device control in response to the user's voice input while taking the functions of the new device into account. In particular, even when the new device is not a device registered in advance with the voice assistant service, the functions of the new device need to be reflected effectively in the voice assistant service.

An embodiment of the present disclosure may provide a system and method capable of registering a new device by using the functions of a device previously registered for a voice assistant service.

In addition, an embodiment of the present disclosure may provide a system and method capable of registering the functions of a new device by combining or deleting functions of at least one previously registered device.

In addition, an embodiment of the present disclosure may provide a system and method capable of registering a new device by using utterance data related to the functions of a previously registered device.

In addition, an embodiment of the present disclosure may provide a system and method capable of obtaining utterance data and action data related to the functions of a new device by using utterance data and action data for a previously registered device.

In addition, an embodiment of the present disclosure may provide a system and method capable of generating or updating a voice assistant model specialized for a new device by using utterance data and action data related to the functions of the new device.

As a technical means for achieving the above technical objectives, a first aspect of the present disclosure may provide a method by which a server registers a new device for a voice assistant service, the method comprising: obtaining specification information indicating functions of at least one previously registered device; comparing the functions of the previously registered device with functions of the new device based on the specification information; identifying, based on the comparison result, functions of the new device that correspond to functions of the previously registered device; obtaining previously registered utterance data related to at least some of the identified functions; generating action data for the new device based on the identified functions and the obtained previously registered utterance data; and storing the obtained utterance data and the generated action data in association with the new device, wherein the action data includes data about a series of detailed functions of the new device corresponding to the obtained utterance data.

In addition, a second aspect of the present disclosure may provide a server for registering a new device for a voice assistant service, the server comprising: a communication interface; a memory storing a program including one or more instructions; and a processor that executes the one or more instructions of the program stored in the memory, wherein the processor obtains specification information indicating functions of at least one previously registered device, compares the functions of the previously registered device with functions of the new device based on the specification information, identifies, based on the comparison result, functions of the new device that correspond to functions of the previously registered device, obtains previously registered utterance data related to at least some of the identified functions, generates action data for the new device based on the identified functions and the obtained previously registered utterance data, and stores the obtained utterance data and the generated action data in a predetermined database (DB) in association with the new device, and wherein the action data includes data about a series of detailed functions of the new device corresponding to the obtained utterance data.

In addition, a third aspect of the present disclosure may provide a computer-readable recording medium on which a program for executing the method of the first aspect on a computer is recorded.
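The operations of the first aspect can be illustrated with a minimal Python sketch. It is not the implementation of the disclosure: the `RegisteredDevice` class, the `register_new_device` function, the function names such as `power_on`, and the rule that two functions correspond when their names are equal are all assumptions made for illustration. The device names follow the example of FIG. 2.

```python
from dataclasses import dataclass, field


@dataclass
class RegisteredDevice:
    """Hypothetical record of a previously registered device."""
    name: str
    # function name -> utterance texts already registered for that function
    utterances_by_function: dict[str, list[str]] = field(default_factory=dict)


def register_new_device(new_name, new_functions, registered_devices, db):
    """Reuse registered utterance data for functions the new device shares."""
    entries = []
    for function in new_functions:
        for device in registered_devices:
            # Comparison step (assumption: equal names mean corresponding functions).
            if function not in device.utterances_by_function:
                continue
            for utterance in device.utterances_by_function[function]:
                # Action data: the series of detailed functions for this utterance.
                action = {"device": new_name, "steps": [function]}
                entries.append({"utterance": utterance, "action": action})
            break  # the first registered device that matches is used
    db[new_name] = entries  # store in association with the new device
    return entries


air_conditioner_a = RegisteredDevice(
    "air conditioner A",
    {
        "power_on": ["Turn on the air conditioner"],
        "set_temperature": ["Set the temperature to 24 degrees"],
    },
)
dehumidifier_a = RegisteredDevice(
    "dehumidifier A", {"dehumidify": ["Start dehumidifying"]}
)

database = {}
result = register_new_device(
    "air conditioner B",
    ["power_on", "dehumidify", "air_purify"],
    [air_conditioner_a, dehumidifier_a],
    database,
)
for entry in result:
    print(entry["utterance"], "->", entry["action"]["steps"])
```

In this sketch the hypothetical `air_purify` function has no counterpart on any registered device, so no utterance data is reused for it; such functions would need utterance data and action data generated separately.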

FIG. 1 is a schematic diagram of a system for providing a voice assistant service according to an embodiment of the present disclosure.
FIG. 2 is a diagram illustrating an example in which a voice assistant server according to an embodiment of the present disclosure registers a new device based on a function of a previously registered device.
FIG. 3 is a flowchart of a method by which the voice assistant server 3000 registers a new device according to an embodiment of the present disclosure.
FIG. 4 is a flowchart of a method by which a server compares a function of a preset device with a function of a new device according to an embodiment of the present disclosure.
FIG. 5A is a diagram illustrating an example of comparing a function of a previously registered device with a function of a new device according to an embodiment of the present disclosure.
FIG. 5B is a diagram illustrating an example of comparing a function set of a previously registered device with a function of a new device according to an embodiment of the present disclosure.
FIG. 5C is a diagram illustrating an example of comparing a combination of a function and a function set of a previously registered device with a function of a new device according to an embodiment of the present disclosure.
FIG. 5D is a diagram illustrating an example of comparing a combination of functions and function sets of a plurality of previously registered devices with a function of a new device according to an embodiment of the present disclosure.
FIG. 5E is a diagram illustrating an example of comparing the functions that remain after some functions of a previously registered device are deleted with a function of a new device according to an embodiment of the present disclosure.
FIG. 6 is a flowchart of a method by which a voice assistant server generates utterance data and action data related to a function of a new device that differs from the functions of a previously registered device according to an embodiment of the present disclosure.
FIG. 7A is a diagram illustrating an example of a query output from a voice assistant server to generate utterance data and action data related to a function of a new device according to an embodiment of the present disclosure.
FIG. 7B is a diagram illustrating an example in which a query recommending an utterance sentence is output to generate utterance data and action data related to a function of a new device according to an embodiment of the present disclosure.
FIG. 8 is a flowchart of a method by which the voice assistant server 3000 expands utterance data according to an embodiment of the present disclosure.
FIG. 9A is a diagram illustrating an example in which similar utterance data is generated from utterance data according to an embodiment of the present disclosure.
FIG. 9B is a diagram illustrating an example in which a representative utterance sentence and similar utterance sentences are matched to action data according to an embodiment of the present disclosure.
FIG. 10A is a diagram illustrating an example in which similar utterance data is generated from utterance data according to an embodiment of the present disclosure.
FIG. 10B is a diagram illustrating an example in which a representative utterance sentence and similar utterance sentences are matched to action data according to an embodiment of the present disclosure.
FIG. 11A is a diagram illustrating an example of utterance data according to an embodiment of the present disclosure.
FIG. 11B is a diagram illustrating an example of utterance data according to an embodiment of the present disclosure.
FIG. 12 is a diagram illustrating an example of specification information of a device according to an embodiment of the present disclosure.
FIG. 13 is a block diagram of a voice assistant server according to an embodiment of the present disclosure.
FIG. 14 is a diagram illustrating another example of a voice assistant server according to an embodiment of the present disclosure.
FIG. 15 is a conceptual diagram illustrating an action plan management model according to an embodiment of the present disclosure.
FIG. 16 is a diagram illustrating a capsule database stored in an action plan management model according to an embodiment of the present disclosure.
FIG. 17 is a block diagram of an IoT cloud server according to an embodiment of the present disclosure.
FIG. 18 is a block diagram of a client device according to an embodiment of the present disclosure.

Hereinafter, embodiments of the present disclosure will be described in detail with reference to the accompanying drawings so that those of ordinary skill in the art to which the present disclosure pertains can easily implement them. However, the present disclosure may be implemented in various different forms and is not limited to the embodiments described herein. In the drawings, parts unrelated to the description are omitted in order to describe the present disclosure clearly, and similar reference numerals are assigned to similar parts throughout the specification.

Throughout the specification, when a part is said to be "connected" to another part, this includes not only the case of being "directly connected" but also the case of being "electrically connected" with another element interposed therebetween. In addition, when a part is said to "include" a certain component, this means that it may further include other components rather than excluding them, unless specifically stated otherwise.

Functions related to artificial intelligence according to the present disclosure are operated through a processor and a memory. The processor may consist of one or a plurality of processors. In this case, the one or more processors may be a general-purpose processor such as a CPU, an AP, or a digital signal processor (DSP), a graphics-dedicated processor such as a GPU or a vision processing unit (VPU), or an artificial intelligence-dedicated processor such as an NPU. The one or more processors control the processing of input data according to a predefined operation rule or an artificial intelligence model stored in the memory. Alternatively, when the one or more processors are artificial intelligence-dedicated processors, they may be designed with a hardware structure specialized for processing a specific artificial intelligence model.

The predefined operation rule or artificial intelligence model is characterized by being created through learning. Here, being created through learning means that a basic artificial intelligence model is trained with a large amount of training data by a learning algorithm, so that a predefined operation rule or artificial intelligence model set to perform a desired characteristic (or purpose) is created. Such learning may be carried out in the device itself on which the artificial intelligence according to the present disclosure is performed, or through a separate server and/or system. Examples of the learning algorithm include supervised learning, unsupervised learning, semi-supervised learning, and reinforcement learning, but are not limited to these examples.

The artificial intelligence model may consist of a plurality of neural network layers. Each of the neural network layers has a plurality of weight values and performs a neural network operation through an operation between the operation result of the previous layer and the plurality of weights. The weights of the neural network layers may be optimized by the learning result of the artificial intelligence model. For example, the weights may be updated so that a loss value or cost value obtained from the artificial intelligence model during the learning process is reduced or minimized. The artificial neural network may include a deep neural network (DNN), for example, a convolutional neural network (CNN), a deep neural network (DNN), a recurrent neural network (RNN), a restricted Boltzmann machine (RBM), a deep belief network (DBN), a bidirectional recurrent deep neural network (BRDNN), or deep Q-networks, but is not limited to these examples.
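The relationship between a layer's weights, the previous layer's result, and the loss can be illustrated with a small Python sketch. It assumes a single linear layer, a squared-error loss, and plain gradient descent; these choices, and all names and values in the code, are illustrative assumptions and are not specified by the disclosure.

```python
def layer_forward(weights, inputs):
    """Combine the previous layer's result with this layer's weights."""
    return [sum(w * x for w, x in zip(row, inputs)) for row in weights]


def squared_loss(output, target):
    """Loss value to be reduced during learning (assumed squared error)."""
    return sum((o - t) ** 2 for o, t in zip(output, target))


def update_weights(weights, inputs, target, learning_rate=0.05):
    """One gradient-descent step on the squared loss of a linear layer."""
    output = layer_forward(weights, inputs)
    return [
        # d(loss)/d(w) = 2 * (output - target) * input for each weight
        [w - learning_rate * 2 * (o - t) * x for w, x in zip(row, inputs)]
        for row, o, t in zip(weights, output, target)
    ]


weights = [[0.5, -0.2], [0.1, 0.3]]  # hypothetical initial weights
inputs = [1.0, 2.0]                  # result of the previous layer
target = [1.0, 0.0]                  # desired output

loss_before = squared_loss(layer_forward(weights, inputs), target)
for _ in range(20):
    weights = update_weights(weights, inputs, target)
loss_after = squared_loss(layer_forward(weights, inputs), target)
print(loss_before, loss_after)
```

Each update moves the weights in the direction that lowers the loss, so the loss after the loop is smaller than the loss before it.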

Hereinafter, the present disclosure will be described in detail with reference to the accompanying drawings.

FIG. 1 is a schematic diagram of a system for providing a voice assistant service according to an embodiment of the present disclosure.

Referring to FIG. 1, a system for providing a voice assistant service according to an embodiment of the present disclosure may include a client device 1000, at least one device 2000, a voice assistant server 3000, and an Internet of Things (IoT) cloud server 4000. The at least one device 2000 may be a device registered in advance with the voice assistant server 3000 or the IoT cloud server 4000 for the voice assistant service.

The client device 1000 may receive a voice input (for example, an utterance) from a user. In an embodiment, the client device 1000 may include a speech recognition module. In an embodiment, the client device 1000 may include a speech recognition module with limited functionality. For example, the client device 1000 may include a speech recognition module that has a function of detecting a designated voice input (for example, a wake-up input such as 'Hi Bixby' or 'Okay Google') or a function of preprocessing a voice signal obtained from some voice inputs. The client device 1000 may be an artificial intelligence (AI) speaker, but is not limited thereto. In an embodiment, some of the at least one device 2000 may be the client device 1000.

The at least one device 2000 may be a target device that performs a specific operation according to a control command from the voice assistant server 3000 and/or the IoT cloud server 4000. The at least one device 2000 may be controlled to perform a specific operation based on the user's voice input received by the client device 1000. In an embodiment, at least some of the at least one device 2000 may receive a control command from the client device 1000 without receiving a control command from the voice assistant server 3000 and/or the IoT cloud server 4000.

The client device 1000 may receive the user's voice input through a microphone and transmit the received voice input to the voice assistant server 3000. In an embodiment, the client device 1000 may obtain a voice signal from the received voice input and transmit the voice signal to the voice assistant server 3000.

The voice assistant server 3000 may receive the user's voice input from the client device 1000, interpret the received voice input to select, from among the at least one device 2000, a target device that will perform operations according to the user's intent, and provide information about the selected target device and the operations to be performed by the target device to the IoT cloud server 4000 or the target device.

The IoT cloud server 4000 may register and manage information about the device 2000 for the voice assistant service, and may provide device information for the voice assistant service to the voice assistant server 3000. The device information is information related to a device used to provide the voice assistant service and may include, for example, identification information of the device (device ID information), capability information, location information, and status information. In addition, the IoT cloud server 4000 may receive information about the target device and the operations to be performed by the target device from the voice assistant server 3000, and provide the target device with control information for controlling those operations.
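The device information described above can be pictured as a simple record held in a registry. The following Python sketch is only an illustration: the `DeviceInfo` and `DeviceRegistry` names, the field types, and the sample values are assumptions, and the disclosure does not prescribe any particular data layout.

```python
from dataclasses import dataclass, field


@dataclass
class DeviceInfo:
    """Hypothetical device information record."""
    device_id: str                                          # identification information
    capabilities: list[str] = field(default_factory=list)   # capability information
    location: str = ""                                      # location information
    status: str = "off"                                     # status information


class DeviceRegistry:
    """Minimal stand-in for the registration and management role of a server."""

    def __init__(self):
        self._devices = {}

    def register(self, info):
        # Registering the same device ID again replaces the earlier record.
        self._devices[info.device_id] = info

    def get(self, device_id):
        return self._devices[device_id]


registry = DeviceRegistry()
registry.register(
    DeviceInfo("aircon-a", ["power_on", "set_temperature"], "living room")
)
print(registry.get("aircon-a"))
```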

When a new device 2900 for the voice assistant service is added, the voice assistant server 3000 may generate utterance data and action data for the new device 2900 by using the functions of the previously registered device 2000, its utterance data, and the operations corresponding to that utterance data. In addition, the voice assistant server 3000 may use the utterance data and action data for the new device 2900 to generate or update a voice assistant model to be used for the new device 2900.

Utterance data is data related to the voice uttered by a user in order to receive the voice assistant service, and may be data representing the user's utterance. The utterance data may be data used to interpret the user's intent related to an operation of the device 2000. The utterance data may include, for example, at least one of an utterance sentence in text format or an utterance parameter in the format of an output value of an NLU model. The utterance parameter is data output from the NLU model and may include an intent and a parameter. The intent is information determined by interpreting text with the NLU model and may indicate the user's utterance intention. The intent may be, for example, information indicating the device operation intended by the user. The intent may include not only information indicating the user's utterance intention (hereinafter, intention information) but also a numerical value corresponding to that information. The numerical value may indicate the probability that the text is related to information indicating a specific intention. When a plurality of pieces of intention information are obtained as a result of interpreting the text with the NLU model, the intention information with the largest corresponding numerical value may be determined as the intent. The parameter may be variable information for determining detailed operations of the device related to the intent. The parameter is information related to the intent, and a plurality of types of parameters may correspond to one intent. The parameter may include not only variable information for determining the operation information of the device but also a numerical value indicating the probability that the text is related to that variable information. As a result of interpreting the text with the natural language understanding model, a plurality of pieces of variable information indicating a parameter may be obtained. In this case, the variable information with the largest corresponding numerical value may be determined as the parameter.
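The selection of an intent and a parameter by largest numerical value can be sketched in Python as follows. The layout of `nlu_output`, the intent names, and the probabilities are hypothetical; an actual NLU model would produce its own output format.

```python
def pick_best(candidates):
    """Return the candidate whose numerical value (probability) is largest."""
    return max(candidates, key=candidates.get)


def interpret(nlu_output):
    """Build an utterance parameter: one intent plus one value per parameter type."""
    intent = pick_best(nlu_output["intents"])
    parameters = {
        slot: pick_best(values)
        for slot, values in nlu_output["parameters"].items()
    }
    return {"intent": intent, "parameters": parameters}


# Hypothetical NLU output for an utterance such as
# "Set the temperature to 24 degrees".
nlu_output = {
    "intents": {"set_temperature": 0.86, "power_on": 0.11, "play_music": 0.03},
    "parameters": {"temperature": {"24": 0.92, "20": 0.08}},
}

utterance_parameter = interpret(nlu_output)
print(utterance_parameter)
```

Because one intent may have several types of parameters, the sketch picks the most probable value separately for each parameter type.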

Action data may be data about a series of detailed operations of the device 2000 corresponding to given utterance data. For example, the action data may include information about the detailed operations that the device will perform in response to the utterance data, the relationship between each detailed operation and other detailed operations, and the execution order of the detailed operations. The relationship between one detailed operation and another includes information about the other detailed operations that must be executed before the detailed operation in question can be executed. For example, when the operation to be performed is "play music", "power on" may be another detailed operation that must be executed before the "play music" operation. In addition, the action data may include, for example, the functions that the target device must execute to perform a specific operation, the execution order of the functions, the input values required to execute the functions, and the output values produced as the result of executing the functions, but is not limited thereto.
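The "power on" before "play music" example can be expressed as a dependency graph whose topological order gives the execution order. The Python sketch below uses the standard library's `graphlib.TopologicalSorter`; the `action_data` layout and the operation names are assumptions for illustration, not a format defined by the disclosure.

```python
from graphlib import TopologicalSorter

# Hypothetical action data: each detailed operation maps to the set of
# detailed operations that must be executed before it.
action_data = {
    "utterance": "Play music",
    "prerequisites": {
        "power_on": set(),
        "play_music": {"power_on"},
    },
}


def execution_order(action):
    """Order the detailed operations so that prerequisites come first."""
    sorter = TopologicalSorter(action["prerequisites"])
    return list(sorter.static_order())


order = execution_order(action_data)
print(order)
```

A topological sort is one possible way to derive an execution order from the relationships between detailed operations; it also raises an error if the relationships contain a cycle, which would indicate inconsistent action data.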

또한, 보이스 어시스턴트 서버(3000)는 신규 디바이스(2900)가 식별되면, 신규 디바이스(2900)의 기능 정보를 획득하고, 기 등록된 디바이스(2000)의 기능 정보와 신규 디바이스(2900)의 기능 정보를 비교함으로써, 신규 디바이스(2900)의 기능과 관련하여 활용 가능한 기 등록된 발화 데이터를 결정할 수 있다. 또한, 보이스 어시스턴트 서버(3000)는, 기 등록된 발화 데이터 및 이에 대응되는 기능들을 편집하고, 편집된 발화 데이터 및 기능들에 관한 데이터를 이용하여, 액션 데이터를 생성할 수 있다.In addition, when the new device 2900 is identified, the voice assistant server 3000 acquires function information of the new device 2900, and stores function information of the previously registered device 2000 and the function information of the new device 2900. By comparing, it is possible to determine pre-registered speech data usable in relation to the function of the new device 2900. In addition, the voice assistant server 3000 may edit pre-registered speech data and functions corresponding thereto, and generate action data using the edited speech data and data related to the functions.

디바이스(2000)는, 스마트폰, 태블릿 PC, PC, 스마트 TV, 휴대폰, PDA(personal digital assistant), 랩톱, 미디어 플레이어, 마이크로 서버, GPS(global positioning system) 장치, 전자책 단말기, 디지털방송용 단말기, 네비게이션, 키오스크, MP3 플레이어, 디지털 카메라 및 기타 모바일 또는 비모바일 컴퓨팅 장치일 수 있으나, 이에 제한되지 않는다. 또한, 디바이스(2000)는 통신 기능 및 데이터 프로세싱 기능을 구비한 전등, 에어컨, TV, 로봇 청소기, 세탁기, 체중계, 냉장고, 셋톱 박스(set-top box), 홈 오토메이션 컨트롤 패널(home automation control panel), 보안 컨트롤 패널(security control panel), 게임 콘솔, 전자 키, 캠코더(camcorder), 또는 전자 액자 등의 가전 기기일 수 있다. 또한, 디바이스(2000)는 통신 기능 및 데이터 프로세싱 기능을 구비한 시계, 안경, 헤어 밴드 및 반지 등의 웨어러블 디바이스일 수 있다. 그러나, 이에 제한되지 않으며, 디바이스(1000)는 보이스 어시스턴트 서버(3000) 및/또는 IoT 클라우드 서버(4000)로부터 네트워크를 통하여 데이터를 송수신할 수 있는 모든 종류의 기기를 포함할 수 있다.The device 2000 includes a smart phone, a tablet PC, a PC, a smart TV, a mobile phone, a personal digital assistant (PDA), a laptop, a media player, a micro server, a global positioning system (GPS) device, an e-book terminal, a digital broadcasting terminal, Navigation, kiosk, MP3 player, digital camera, and other mobile or non-mobile computing devices may be, but are not limited thereto. In addition, the device 2000 includes a lamp, air conditioner, TV, robot cleaner, washing machine, weight scale, refrigerator, set-top box, and home automation control panel with communication and data processing functions. , A security control panel, a game console, an electronic key, a camcorder, or an electronic frame. In addition, the device 2000 may be a wearable device such as a watch, glasses, hair band, and ring having a communication function and a data processing function. However, the present invention is not limited thereto, and the device 1000 may include all types of devices capable of transmitting and receiving data from the voice assistant server 3000 and/or the IoT cloud server 4000 through a network.

The network 200 includes a local area network (LAN), a wide area network (WAN), a value added network (VAN), a mobile radio communication network, a satellite communication network, and combinations thereof. It is a data communication network in the broad sense that allows the network entities shown in FIG. 1 to communicate smoothly with one another, and it includes the wired Internet, the wireless Internet, and mobile wireless communication networks. Wireless communication may include, for example, wireless LAN (Wi-Fi), Bluetooth, Bluetooth Low Energy, Zigbee, Wi-Fi Direct (WFD), ultra-wideband (UWB), infrared communication (IrDA, Infrared Data Association), and near field communication (NFC), but is not limited thereto.

FIG. 2 is a diagram illustrating an example in which a voice assistant server registers a new device based on the functions of a previously registered device, according to an embodiment of the present disclosure.

Referring to FIG. 2, when air conditioner B, which is the new device 2900, is identified, the voice assistant server 3000 may obtain function information of air conditioner B and compare the function information of air conditioner A and dehumidifier A, which are previously registered devices 2000, with the function information of air conditioner B.

The voice assistant server 3000 may compare the functions of air conditioner B with the functions of air conditioner A and dehumidifier A, and may identify, among the functions of air conditioner B, those that are the same as or similar to the functions of air conditioner A and dehumidifier A. For example, the voice assistant server 3000 may identify that, among the functions of air conditioner B, “power ON/OFF”, “cooling mode ON/OFF”, “dehumidification mode ON/OFF”, “temperature UP/DOWN”, and “humidity control” match functions of air conditioner A and dehumidifier A.

The voice assistant server 3000 may determine utterance data corresponding to at least one of the identified functions and generate action data corresponding to the determined utterance data. For example, the voice assistant server 3000 may generate or edit utterance data corresponding to at least one of the functions of air conditioner B by using the following: “Turn on the power.”, the utterance data corresponding to “power ON” of air conditioner A; “Lower the temperature.”, the utterance data corresponding to “cooling mode ON, temperature DOWN” of air conditioner A; “Raise the temperature.”, the utterance data corresponding to “cooling mode ON, temperature UP” of air conditioner A; “Turn on the power”, the utterance data corresponding to “power ON” of dehumidifier A; and “Lower the humidity”, the utterance data corresponding to “humidity DOWN” of dehumidifier A.
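The FIG. 2 comparison can be sketched as a simple set lookup. This is only an illustration under stated assumptions, not the disclosed implementation: the function names come from the example above, but the way they are split between air conditioner A and dehumidifier A, the extra "air purification" function, and the helper name `match_functions` are hypothetical.

```python
# Hypothetical registry: which previously registered device supports which function.
# The split between the two devices is assumed for illustration.
REGISTERED = {
    "air conditioner A": {"power ON/OFF", "cooling mode ON/OFF", "temperature UP/DOWN"},
    "dehumidifier A": {"power ON/OFF", "dehumidification mode ON/OFF", "humidity control"},
}

# Functions reported by the new device; "air purification" is an invented extra.
AIR_CONDITIONER_B = [
    "power ON/OFF", "cooling mode ON/OFF", "dehumidification mode ON/OFF",
    "temperature UP/DOWN", "humidity control", "air purification",
]


def match_functions(new_functions, registered):
    """Map each new-device function to the registered devices that also support it."""
    matches = {}
    for function in new_functions:
        owners = sorted(dev for dev, funcs in registered.items() if function in funcs)
        if owners:
            matches[function] = owners
    return matches


matches = match_functions(AIR_CONDITIONER_B, REGISTERED)
# Functions with no registered counterpart would need new utterance data.
unmatched = [f for f in AIR_CONDITIONER_B if f not in matches]
```

Exact string equality stands in here for the "same or similar" judgment; a real comparison would presumably also use similarity information.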

FIG. 3 is a flowchart of a method by which the voice assistant server 3000 registers a new device, according to an embodiment of the present disclosure.

In operation S300, the voice assistant server 3000 may obtain function information on the functions of the new device 2900 and the previously registered device 2000. When the new device 2900 is added to the system for the voice assistant service, the voice assistant server 3000 may identify the functions supported by the new device 2900 from specification information of the new device 2900, obtained from the new device 2900 itself or from an external server (not shown) connected to the new device 2900. For example, as shown in FIG. 12, the voice assistant server 3000 may identify the functions supported by the new device 2900 from specification information that includes an identifier of the device, names of the functions it can perform, descriptions of those functions, and information on the parameters required to perform them.

The voice assistant server 3000 may also identify the functions of the device 2000 from the specification information of the previously registered device 2000. The voice assistant server 3000 may identify the functions of the device 2000 from specification information stored in a DB (not shown) of the voice assistant server 3000. Alternatively, the voice assistant server 3000 may receive, from the IoT cloud server 4000, specification information of the device 2000 stored in a DB (not shown) of the IoT cloud server 4000, and identify the functions of the device 2000 from the received specification information. Like the specification information of the new device 2900, the specification information of the previously registered device 2000 may include an identifier of the device, names of the functions it can perform, descriptions of those functions, and information on the parameters required to perform them.
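As a rough sketch of what such specification information might look like in code, the four fields named above (device identifier, function name, function description, required parameters) can be modeled as plain records. The class names, field names, and sample values below are assumptions made for illustration; the actual layout is the one shown in FIG. 12.

```python
from dataclasses import dataclass, field


@dataclass
class FunctionSpec:
    name: str                                       # name of the executable function
    description: str                                # description of the function
    parameters: list = field(default_factory=list)  # parameters needed to run it


@dataclass
class DeviceSpec:
    device_id: str                                  # identifier of the device
    functions: list = field(default_factory=list)

    def function_names(self):
        """Return the names of all functions the device supports."""
        return [f.name for f in self.functions]


# Hypothetical specification for a new device.
spec = DeviceSpec(
    device_id="aircon-B",
    functions=[
        FunctionSpec("power ON", "Turns the device on"),
        FunctionSpec("temperature UP", "Raises the target temperature", ["degrees"]),
    ],
)
```

Calling `spec.function_names()` yields the list of names that the comparison in operation S310 would work from.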

In operation S310, the voice assistant server 3000 may determine whether a function of the previously registered device 2000 and a function of the new device 2900 are the same or similar. By comparing the functions of the previously registered device 2000 with the functions of the new device 2900, the voice assistant server 3000 may identify, among the functions of the new device 2900, those that are the same as or similar to functions of the previously registered device 2000.

The voice assistant server 3000 may identify, from the specification information of the new device 2900, the name of a function supported by the new device 2900, and determine whether the identified name is the same as or similar to the name of a function supported by the previously registered device 2000. In this case, the voice assistant server 3000 may store in advance information on names and synonyms that denote a given function, and may determine, based on the stored synonym information, whether the function of the previously registered device 2000 and the function of the new device 2900 are the same or similar.
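One way to picture the name comparison with stored synonym information is a lookup that normalizes each name to a canonical form before comparing. The synonym table, its entries, and the helper names below are hypothetical; the disclosure does not specify how the synonym information is stored.

```python
# Hypothetical synonym table: canonical function name -> accepted variants.
SYNONYMS = {
    "power on": {"power on", "turn on", "switch on"},
    "temperature down": {"temperature down", "lower temperature", "cool down"},
}


def canonical_name(name):
    """Normalize case and whitespace, then map known variants to a canonical name."""
    key = " ".join(name.lower().split())
    for canonical, variants in SYNONYMS.items():
        if key in variants:
            return canonical
    return key


def same_or_similar(name_a, name_b):
    """Treat two function names as matching if they share a canonical name."""
    return canonical_name(name_a) == canonical_name(name_b)
```

Names that are absent from the table fall back to a case-insensitive exact comparison.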

The voice assistant server 3000 may also refer to utterance data when determining whether functions are the same or similar. Using utterance data related to a function of the previously registered device 2000, the voice assistant server 3000 may determine whether a function of the new device 2900 is the same as or similar to that function. In this case, the voice assistant server 3000 may make the determination based on the meanings of the words contained in the utterance data.

The voice assistant server 3000 may determine whether a single function of the previously registered device 2000 and a single function of the new device 2900 are the same or similar. A single function may be one function such as ‘power ON’, ‘power OFF’, ‘temperature UP’, or ‘temperature DOWN’. The voice assistant server 3000 may also determine whether a function set of the previously registered device 2000 and a function set of the new device 2900 are the same or similar. A function set is a set of single functions and may be a combination of functions such as ‘power ON + temperature UP’ or ‘temperature DOWN + humidity DOWN’.

If it is determined in operation S310 that the function of the previously registered device 2000 and the function of the new device 2900 are the same or similar, then in operation S320 the voice assistant server 3000 may obtain previously registered utterance data related to the same or similar functions.

The voice assistant server 3000 may extract from a DB (not shown) the utterance data corresponding to those functions of the previously registered device 2000 that were determined to be the same as or similar to a function of the new device 2900.

The voice assistant server 3000 may likewise extract from the DB (not shown) the utterance data corresponding to those function sets of the previously registered device 2000 that were determined to be the same as or similar to functions of the new device 2900.

In this case, the utterance data corresponding to the functions of the previously registered device 2000 and the utterance data corresponding to the function sets of the previously registered device 2000 may be stored in the DB (not shown) in advance.
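The extraction in operation S320 can be sketched as a lookup keyed by function or function set. The in-memory dictionary below merely stands in for the DB, and its keys, utterances, and the helper name `extract_utterances` are assumptions. A single function is represented as a one-element `frozenset` and a function set as a larger one, so both cases share one code path.

```python
# Hypothetical store of previously registered utterance data.
# Key: a single function (one element) or a function set (several elements).
UTTERANCE_DB = {
    frozenset({"power ON"}): ["Turn on the power."],
    frozenset({"cooling mode ON", "temperature DOWN"}): ["Lower the temperature."],
    frozenset({"humidity DOWN"}): ["Lower the humidity."],
}


def extract_utterances(new_device_functions, db=UTTERANCE_DB):
    """Return the entries whose functions are all supported by the new device."""
    supported = set(new_device_functions)
    return {key: list(utts) for key, utts in db.items() if key <= supported}


# A new device without humidity control reuses only the first two entries.
reusable = extract_utterances(["power ON", "cooling mode ON", "temperature DOWN"])
```

The subset test `key <= supported` means a function set is reused only when the new device supports every function in it.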

Meanwhile, the voice assistant server 3000 may edit the functions and function sets determined to be the same or similar and generate utterance data corresponding to the edited functions. The voice assistant server 3000 may combine functions determined to be the same or similar and generate utterance data corresponding to the combined functions. The voice assistant server 3000 may also combine a function and a function set determined to be the same or similar and generate utterance data corresponding to the combined functions. In addition, the voice assistant server 3000 may delete some of the functions in a function set determined to be the same or similar and generate utterance data corresponding to the function set from which those functions were deleted.

The voice assistant server 3000 may expand the utterance data. By modifying the wording of the extracted or generated utterance data, the voice assistant server 3000 may generate similar utterance data that has the same meaning as the extracted or generated utterance data but is worded differently.

In operation S330, the voice assistant server 3000 may generate action data for the new device 2900 based on the same or similar functions and the utterance data. The action data may be data indicating the detailed operations of a device in response to utterance data and the execution order of those detailed operations. The action data may include, for example, identification values of the detailed operations, the execution order of the detailed operations, and control commands for executing the detailed operations, but is not limited thereto.

For example, when the function corresponding to the utterance data is a single function, the voice assistant server 3000 may generate action data that includes a detailed operation representing the single function. When the function corresponding to the utterance data is a function set, the voice assistant server 3000 may generate detailed operations representing the functions in the function set, together with the execution order of those detailed operations.
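A minimal sketch of such action data follows, assuming each detailed operation carries the three items listed above: an identification value, a position in the execution order, and a control command. The `Step` record, the `op-N` identifier scheme, and the command format are invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    step_id: str   # identification value of the detailed operation
    order: int     # position in the execution order
    command: str   # control command (this format is an assumption)


def build_action_data(functions):
    """Build action data from function names given in their execution order."""
    return [
        Step(step_id=f"op-{i}", order=i, command=name.replace(" ", "_").upper())
        for i, name in enumerate(functions, start=1)
    ]


# A single function yields one detailed operation.
single = build_action_data(["power ON"])
# A function set yields several detailed operations with an execution order.
combined = build_action_data(["cooling mode ON", "temperature DOWN"])
```

The same builder covers both cases in the paragraph above; only the number of steps differs.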

In operation S340, the voice assistant server 3000 may generate or update a voice assistant model related to the new device 2900 by using the utterance data and the action data.

The voice assistant server 3000 may generate or update the voice assistant model related to the new device 2900 by using the utterance data corresponding to functions of the previously registered device 2000 that relate to functions of the new device 2900, utterance data newly generated for the functions of the new device 2900, the expanded utterance data, and the action data. The voice assistant server 3000 may accumulate and store the utterance data and action data related to the new device 2900. The voice assistant server 3000 may also generate or update a Concept Action Network (CAN), a capsule-type database included in an action plan management model (not shown).

The voice assistant model related to the new device 2900 is a model used for the voice assistant service and specialized for the new device 2900, and may be a model that determines the operation to be performed by a target device in response to a user's voice input. The voice assistant model related to the new device 2900 may include, for example, an NLU model, an NLG model, and an action plan management model. The NLU model related to the new device 2900 is an artificial intelligence model for interpreting the user's voice input in view of the functions of the new device 2900, and the NLG model related to the new device 2900 may be an artificial intelligence model for generating natural language for dialogue with the user in view of the functions of the new device. The action plan management model related to the new device 2900 may be a model that plans the operation information to be performed by the new device 2900 in view of its functions. The action plan management model may select, from the interpreted user utterance, the detailed operations that the new device 2900 is to perform, and plan the execution order of the selected detailed operations. The action plan management model may use the planning result to obtain operation information on the detailed operations to be performed by the new device 2900. The operation information may be information on the detailed operations to be performed by the device, the relationships between the detailed operations, and the execution order of the detailed operations. The operation information may include, for example, the functions that the new device 2900 must execute to perform the detailed operations, the execution order of the functions, the input values required to execute the functions, and the output values produced as results of executing the functions, but is not limited thereto.

If a voice assistant model that can be used for the new device 2900 already exists, the voice assistant server 3000 may update that voice assistant model.

In addition, the voice assistant server 3000 may generate or update the voice assistant model related to the new device 2900 by using the utterance data corresponding to functions of the previously registered device 2000 that relate to functions of the new device 2900, utterance data newly generated for the functions of the new device 2900, the expanded utterance data, and the action data.

The action plan management model may also manage information on a plurality of detailed operations of the new device 2900 and the relationships among them. The relationship between each detailed operation and the other detailed operations may include information on other detailed operations that must be executed before a given detailed operation can be executed.
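Prerequisite relationships of this kind form a dependency graph, and one common way to derive a valid execution order from such a graph is a topological sort. The sketch below uses Python's standard `graphlib` module for that purpose; the operations and their prerequisites are invented, and the disclosure does not state that the action plan management model works this way.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical prerequisites: detailed operation -> operations that must run first.
PREREQUISITES = {
    "power ON": set(),
    "cooling mode ON": {"power ON"},
    "temperature DOWN": {"cooling mode ON"},
}


def execution_order(prerequisites):
    """Return the detailed operations in an order that respects every prerequisite."""
    return list(TopologicalSorter(prerequisites).static_order())


order = execution_order(PREREQUISITES)
```

`TopologicalSorter` raises `graphlib.CycleError` if the prerequisites are circular, which is a useful sanity check when relationships are entered by hand.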

In an embodiment, the action plan management model may include a Concept Action Network (CAN), a capsule-type database representing the operations of a device and the relationships between those operations. The CAN includes the functions that a device must execute to perform a specific operation, the execution order of the functions, the input values required to execute the functions, and the output values produced as results of executing the functions, and may be implemented as an ontology graph composed of knowledge triples representing concepts and the relationships between concepts.

If it is determined in operation S310 that the function of the previously registered device 2000 and the function of the new device 2900 are neither the same nor similar, then in operation S350 the voice assistant server 3000 may request input of utterance data and action data for the function that differs from the functions of the previously registered device 2000. The voice assistant server 3000 may output a query message for registering the differing function of the new device 2900 and for generating and editing utterance data related to that function. The query message may be provided to the client device 1000, the new device 2900, or a developer's device (not shown). The voice assistant server 3000 may receive a response to the query message from the client device 1000, the new device 2900, or the developer's device (not shown). The voice assistant server 3000 may provide a software development kit (SDK) tool for registering the functions of the new device 2900 to the client device 1000, the new device 2900, or the developer's device (not shown). The voice assistant server 3000 may also provide the user's device 2000 or the developer's device (not shown) with a list of the functions of the new device 2900 that differ from the functions of the previously registered device 2000. The voice assistant server 3000 may provide the user's device 2000 or the developer's device (not shown) with recommended utterance data related to at least some of the differing functions.

In operation S360, the voice assistant server 3000 may obtain utterance data and action data. The voice assistant server 3000 may interpret the response to the query by using an NLU model. The voice assistant server 3000 may interpret the user's response or the developer's response by using an NLU model trained for function registration and utterance data generation. Based on the interpreted response, the voice assistant server 3000 may generate utterance data related to the functions of the new device 2900. The voice assistant server 3000 may generate utterance data related to the functions of the new device 2900 by using the interpreted user response or the interpreted developer response, and may recommend the generated utterance data. The voice assistant server 3000 may select some of the functions of the new device 2900 and generate utterance data related to each of the selected functions. The voice assistant server 3000 may also select some of the functions of the new device 2900 and generate utterance data related to a combination of the selected functions. In addition, the voice assistant server 3000 may generate similar utterance data that has the same meaning as the generated utterance data but is worded differently. The voice assistant server 3000 may generate action data by using the generated utterance data. The voice assistant server 3000 may generate action data corresponding to the generated utterance data by identifying the functions of the new device 2900 related to the generated utterance data and determining the execution order of the identified functions.

FIG. 4 is a flowchart of a method by which a server compares the functions of a preset device with the functions of a new device, according to an embodiment of the present disclosure.

In operation S400, the voice assistant server 3000 may compare the functions of the previously registered device 2000 with the functions of the new device 2900. The voice assistant server 3000 may compare the name of a function supported by the new device 2900 with the name of a function supported by the previously registered device 2000. In this case, the voice assistant server 3000 may store in advance information on names and synonyms that denote a given function, and may compare the functions of the previously registered device 2000 and the functions of the new device 2900 based on the stored synonym information.

The voice assistant server 3000 may also compare the functions of the previously registered device 2000 with the functions of the new device 2900 by referring to utterance data stored in the IoT cloud server 4000. Using utterance data related to a function of the previously registered device 2000, the voice assistant server 3000 may determine whether a function of the new device 2900 is the same as or similar to that function. In this case, the voice assistant server 3000 may make the determination based on the meanings of the words contained in the utterance data.

In operation S405, the voice assistant server 3000 may determine whether any function of the new device 2900 fails to match a function of the previously registered device 2000. The voice assistant server 3000 may determine whether all functions of the new device 2900 match functions of at least one previously registered device 2000. For example, the voice assistant server 3000 may identify, among the functions of the first device 2100, the second device 2200, and the third device 2300, those that match functions of the new device 2900.

When the name of a function of the new device 2900 is identical to the name of a function of the previously registered device 2000, the voice assistant server 3000 may determine that the function of the previously registered device 2000 and the function of the new device 2900 match.

Likewise, when the name of a function of the new device 2900 is similar to the name of a function of the previously registered device 2000 and the two functions are determined to serve the same purpose, the voice assistant server 3000 may determine that the function of the previously registered device 2000 and the function of the new device 2900 match.

If it is determined in operation S405 that every function of the new device 2900 matches a function of the previously registered device 2000, the voice assistant server 3000 may perform operations S320 to S340. Using the utterance data and action data related to the functions of the previously registered device 2000 that match functions of the new device 2900, the voice assistant server 3000 may generate utterance data and action data related to the functions of the new device 2900, and may generate or update a voice assistant model for providing the voice assistant service for the new device 2900.

When it is determined in operation S405 that the new device 2900 has a function that does not match any function of the previously registered device 2000, the voice assistant server 3000 may combine functions of the previously registered device 2000 in operation S410.

The voice assistant server 3000 may combine single functions of at least one device 2000. For example, the voice assistant server 3000 may combine a first function of the first device 2100 with a second function of the first device 2100. As another example, the voice assistant server 3000 may combine the first function of the first device 2100 with a third function of the second device 2200.

The voice assistant server 3000 may combine function sets of at least one device 2000. For example, the voice assistant server 3000 may combine a first function set of the first device 2100 with a second function set of the first device 2100. As another example, the voice assistant server 3000 may combine the first function set of the first device 2100 with a third function set of the second device 2200.

The voice assistant server 3000 may combine a single function and a function set of at least one device 2000. For example, the voice assistant server 3000 may combine the first function of the first device 2100 with the first function set of the first device 2100. As another example, the voice assistant server 3000 may combine the first function of the first device 2100 with the third function set of the second device 2200.
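The three kinds of combination described above (single functions, function sets, and a single function with a function set, within one device or across devices) may be enumerated as in the following sketch. The data layout is an assumption made for illustration: each registered entry is a tuple of function names, where a 1-tuple stands for a single function and a longer tuple stands for a function set.

```python
from itertools import combinations


def candidate_combinations(registered, size=2):
    """Enumerate combinations of registered entries across one or more devices.

    `registered` maps a device label to its entries; each entry is a tuple of
    function names (1-tuple: single function, longer tuple: function set).
    """
    entries = [(device, entry)
               for device, device_entries in registered.items()
               for entry in device_entries]
    return list(combinations(entries, size))


# Hypothetical registry mirroring the first and second devices of the text.
registered = {
    "first device 2100": [("first function",), ("second function",),
                          ("first function", "second function")],
    "second device 2200": [("third function",)],
}
pairs = candidate_combinations(registered)
```

Here `pairs` contains, among others, the pairing of the first function of the first device with the third function of the second device. An embodiment would be expected to filter such candidates, for example by the meaning of the corresponding utterance data, rather than keep all of them.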

The voice assistant server 3000 may combine functions of the previously registered device 2000 on the basis of the utterance data corresponding to those functions. For example, the voice assistant server 3000 may extract, from a DB (not shown), first utterance data corresponding to the first function of the first device 2100 and second utterance data corresponding to the second function, and may determine to combine the first function and the second function based on the meanings of the first utterance data and the second utterance data. As another example, the voice assistant server 3000 may extract, from the DB (not shown), the first utterance data corresponding to the first function of the first device 2100 and third utterance data corresponding to the third function of the second device 2200, and may determine to combine the first function and the third function based on the meanings of the first utterance data and the third utterance data.

In operation S415, the voice assistant server 3000 may compare the combined functions with the functions of the new device 2900. The voice assistant server 3000 may compare the names of the combined functions with the names of the functions supported by the previously registered device 2000. In addition, the voice assistant server 3000 may compare the combined functions with the functions of the new device 2900 by referring to utterance data stored in the IoT cloud server 4000.

In operation S420, after performing operation S415, the voice assistant server 3000 may determine whether the new device 2900 has a function that does not match a function of the previously registered device 2000. When the names of the combined functions are identical to the names of the functions of the previously registered device 2000, the voice assistant server 3000 may determine that the combined functions match the functions of the new device 2900.

In addition, when the names of the combined functions are similar to the names of the functions of the previously registered device 2000 and the combined functions and the functions of the new device 2900 are determined to serve the same purpose, the voice assistant server 3000 may determine that the combined functions match the functions of the new device 2900.

When it is determined in operation S420 that the new device 2900 has no function that does not match a function of the previously registered device 2000, the voice assistant server 3000 may perform operations S320 to S340. Using the utterance data and action data related to the functions of the previously registered device 2000 that match the functions of the new device 2900, together with the utterance data and action data related to the combined functions, the voice assistant server 3000 may generate utterance data and action data related to the functions of the new device 2900, and may generate or update a voice assistant model for providing a voice assistant service for the new device 2900.

When it is determined in operation S420 that the new device 2900 has a function that does not match any function of the previously registered device 2000, the voice assistant server 3000 may delete some of the functions of the device 2000 in operation S425.

The voice assistant server 3000 may delete some of the single functions of at least one device 2000. Among the single functions of the device 2000, the voice assistant server 3000 may delete a single function that is determined not to be supported by the new device 2900.

The voice assistant server 3000 may delete some of the functions within the function sets of at least one device 2000. Among the functions within the function sets of the device 2000, the voice assistant server 3000 may delete a function that is determined not to be supported by the new device 2900.

The voice assistant server 3000 may delete some of the function sets of at least one device 2000. Among the function sets of the device 2000, the voice assistant server 3000 may delete function sets that are determined not to be supported by the new device 2900.
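The deletion of operation S425 may be pictured as a simple filtering step, as in the sketch below. It assumes, for illustration only, that support by the new device can be tested by membership of a function name in a set, and that a function set left with no functions is dropped as a whole; the function names are hypothetical.

```python
def prune_function_sets(function_sets, supported_by_new_device):
    """Drop functions the new device does not support; drop sets left empty."""
    pruned = []
    for function_set in function_sets:
        kept = tuple(fn for fn in function_set if fn in supported_by_new_device)
        if kept:  # a function set with nothing left is deleted entirely
            pruned.append(kept)
    return pruned


# Hypothetical function names, used only to exercise the sketch.
function_sets = [
    ("report filter status", "check temperature", "artificial intelligence mode ON"),
    ("ionizer ON",),
]
supported = {"power ON", "artificial intelligence mode ON"}
remaining = prune_function_sets(function_sets, supported)
```

In this example the first function set is reduced to its one supported function and the second function set is removed entirely.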

In operation S430, the voice assistant server 3000 may compare the functions remaining after the deletion with the functions of the new device 2900. The voice assistant server 3000 may compare the names of the functions remaining after the deletion with the names of the functions supported by the previously registered device 2000. In addition, the voice assistant server 3000 may compare the functions remaining after the deletion with the functions of the new device 2900 by referring to utterance data stored in the IoT cloud server 4000.

In operation S435, after performing operation S430, the voice assistant server 3000 may determine whether the new device 2900 has a function that does not match a function of the previously registered device 2000. When the names of the functions remaining after the deletion are identical to the names of the functions of the previously registered device 2000, the voice assistant server 3000 may determine that the functions remaining after the deletion match the functions of the new device 2900.

In addition, when the names of the functions remaining after the deletion are similar to the names of the functions of the previously registered device 2000 and the functions remaining after the deletion and the functions of the new device 2900 are determined to serve the same purpose, the voice assistant server 3000 may determine that the functions remaining after the deletion match the functions of the new device 2900.

When it is determined in operation S435 that the new device 2900 has no function that does not match a function of the previously registered device 2000, the voice assistant server 3000 may perform operations S320 to S340. Using the utterance data and action data related to the functions of the previously registered device 2000 that match the functions of the new device 2900, the utterance data and action data related to the combined functions, and the utterance data and action data related to the functions remaining after the deletion, the voice assistant server 3000 may generate utterance data and action data related to the functions of the new device 2900, and may generate or update a voice assistant model for providing a voice assistant service for the new device 2900.

When it is determined in operation S435 that the new device 2900 has a function that does not match any function of the previously registered device 2000, the voice assistant server 3000 may perform operation S350.

Although operations S400, S410, S415, S425, and S430 are described with reference to FIG. 4 as being performed sequentially, the present disclosure is not limited thereto. For example, before the functions of the new device 2900 are compared with the functions of the previously registered device 2000 (S400), the operations of combining functions of the previously registered device 2000 (S410) or deleting some of them (S425) may be performed to build a DB (not shown) in advance. In this case, the built DB may be used to compare the functions of the new device 2900 with the functions of the previously registered device 2000, the combined functions of the previously registered device 2000, and the functions of the previously registered device 2000 remaining after deletion (S400, S415, S430), and thereby to determine whether the new device 2900 has a function that does not match a function of the previously registered device 2000.
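The decision flow of FIG. 4, using a DB built in advance as just described, may be sketched as below. This is a simplified illustration under stated assumptions: functions are compared by exact name only, the three collections (`registered`, `combined`, `remaining`) are taken to have been precomputed, and the returned strings are merely labels for the operation that would follow.

```python
def resolve_registration(new_functions, registered, combined, remaining):
    """Return (label of the next operation, still-unmatched functions)."""
    unmatched = set(new_functions) - set(registered)  # S400, S405
    if not unmatched:
        return "S320-S340", unmatched
    unmatched -= set(combined)                         # S415, S420
    if not unmatched:
        return "S320-S340", unmatched
    unmatched -= set(remaining)                        # S430, S435
    if not unmatched:
        return "S320-S340", unmatched
    return "S350", unmatched                           # functions needing new data
```

Any function still unmatched after the three comparisons is returned together with the `"S350"` label, corresponding to the case in which the voice assistant server 3000 performs operation S350.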

FIG. 5(a) is a diagram illustrating an example of comparing functions of previously registered devices with functions of a new device, according to an embodiment of the present disclosure.

Referring to FIG. 5(a), the voice assistant server 3000 may compare the functions of previously registered air conditioner A, the functions of previously registered dehumidifier A, and the functions of new air conditioner B.

For example, the functions supported by new air conditioner B may include power ON/OFF, cooling mode ON/OFF, dehumidification mode ON/OFF, temperature setting, temperature UP/DOWN, humidity setting, humidity UP/DOWN, artificial intelligence mode ON/OFF, and the like. The functions supported by previously registered air conditioner A may include, for example, power ON/OFF, cooling mode ON/OFF, temperature setting, temperature UP/DOWN, and the like. The functions supported by previously registered dehumidifier A may include, for example, power ON/OFF, humidity setting, humidity UP/DOWN, and the like.

Among the functions of new air conditioner B, the voice assistant server 3000 may determine that power ON/OFF, cooling mode ON/OFF, temperature setting, temperature UP/DOWN, humidity setting, and humidity UP/DOWN match functions of previously registered air conditioner A and dehumidifier A. To determine whether the functions match, the voice assistant server 3000 may use utterance data on the functions of air conditioner A and utterance data on the functions of dehumidifier A.

The voice assistant server 3000 may obtain the utterance data corresponding to each of power ON/OFF, cooling mode ON/OFF, temperature setting, and temperature UP/DOWN, which are functions provided by previously registered air conditioner A, and may obtain the utterance data corresponding to each of power ON/OFF, humidity setting, and humidity UP/DOWN, which are functions provided by previously registered dehumidifier A. The voice assistant server 3000 may then generate action data of new air conditioner B by using the matching functions and the obtained utterance data.
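The comparison of FIG. 5(a) may be reproduced with plain set operations, as in the following sketch. The function lists are those given above; treating a match as exact equality of names, and the variable names themselves, are simplifying assumptions made here.

```python
new_air_conditioner_b = {
    "power ON/OFF", "cooling mode ON/OFF", "dehumidification mode ON/OFF",
    "temperature setting", "temperature UP/DOWN", "humidity setting",
    "humidity UP/DOWN", "artificial intelligence mode ON/OFF",
}
registered_devices = {
    "air conditioner A": {"power ON/OFF", "cooling mode ON/OFF",
                          "temperature setting", "temperature UP/DOWN"},
    "dehumidifier A": {"power ON/OFF", "humidity setting", "humidity UP/DOWN"},
}

all_registered = set().union(*registered_devices.values())
matched = new_air_conditioner_b & all_registered
unmatched = new_air_conditioner_b - all_registered

# For each matched function, the registered devices whose utterance data
# could be reused for new air conditioner B.
sources = {fn: [device for device, fns in registered_devices.items() if fn in fns]
           for fn in sorted(matched)}
```

Under these assumptions, `unmatched` holds dehumidification mode ON/OFF and artificial intelligence mode ON/OFF, the two functions of new air conditioner B that have no counterpart in either registered device.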

FIG. 5(b) is a diagram illustrating an example of comparing a function set of a previously registered device with functions of a new device, according to an embodiment of the present disclosure.

Referring to FIG. 5(b), the voice assistant server 3000 may identify that ‘cooling mode ON + temperature UP’, a function set of previously registered air conditioner A, matches ‘cooling mode ON/OFF’ and ‘temperature UP/DOWN’ of new air conditioner B.

The voice assistant server 3000 may obtain ‘Raise the temperature’, which is the utterance data corresponding to ‘cooling mode ON + temperature UP’, a function set of previously registered air conditioner A. In addition, using ‘cooling mode ON/OFF’ and ‘temperature UP/DOWN’, which are functions of new air conditioner B, and the obtained utterance data ‘Raise the temperature’, the voice assistant server 3000 may generate action data for executing the ‘temperature UP’ function after executing the ‘cooling mode ON’ function.

FIG. 5(c) is a diagram illustrating an example of comparing a combination of a function and a function set of a previously registered device with functions of a new device, according to an embodiment of the present disclosure.

Referring to FIG. 5(c), the voice assistant server 3000 may identify that the combination of ‘power ON’, a function of previously registered air conditioner A, and ‘cooling mode ON + temperature DOWN’, a function set of previously registered air conditioner A, matches ‘power ON/OFF’, ‘cooling mode ON/OFF’, and ‘temperature UP/DOWN’ of new air conditioner B.

The voice assistant server 3000 may obtain ‘Turn on the power’, which is the utterance data corresponding to ‘power ON’, a function of previously registered air conditioner A, and ‘Lower the temperature’, which is the utterance data corresponding to ‘cooling mode ON + temperature DOWN’, a function set of previously registered air conditioner A. The voice assistant server 3000 may also edit the obtained utterance data. For example, the voice assistant server 3000 may generate, from ‘Turn on the power’ and ‘Lower the temperature’, utterance data representing ‘Turn on the air conditioner and lower the temperature’.

In addition, using ‘power ON/OFF’, ‘cooling mode ON/OFF’, and ‘temperature UP/DOWN’ of new air conditioner B and the generated utterance data ‘Turn on the air conditioner and lower the temperature’, the voice assistant server 3000 may generate action data for executing the ‘power ON’ function, then the ‘cooling mode ON’ function, and then the ‘temperature DOWN’ function.
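The action data of FIG. 5(c) can be viewed as an ordered list of functions obtained by concatenating the sequences of the combined entries. The sketch below assumes this list representation and assumes that a function appearing in more than one entry is executed only once, at its first position; neither assumption is stated in the disclosure.

```python
def combine_action_sequences(*sequences):
    """Concatenate function sequences in order, keeping first occurrences only."""
    ordered = []
    for sequence in sequences:
        for function in sequence:
            if function not in ordered:
                ordered.append(function)
    return ordered


# Single function and function set of previously registered air conditioner A.
power_on = ["power ON"]
cool_and_lower = ["cooling mode ON", "temperature DOWN"]

action_data = {
    "utterance": "Turn on the air conditioner and lower the temperature",
    "functions": combine_action_sequences(power_on, cool_and_lower),
}
```

The resulting `functions` list gives the execution order described for FIG. 5(c): ‘power ON’, then ‘cooling mode ON’, then ‘temperature DOWN’.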

FIG. 5(d) is a diagram illustrating an example of comparing a combination of functions and function sets of a plurality of previously registered devices with functions of a new device, according to an embodiment of the present disclosure.

Referring to FIG. 5(d), the voice assistant server 3000 may identify that the combination of i) ‘power ON’, a function of previously registered air conditioner A, ii) ‘cooling mode ON + temperature DOWN’, a function set of previously registered air conditioner A, and iii) ‘power ON + humidity DOWN’, a function set of dehumidifier A, matches ‘power ON/OFF’, ‘cooling mode ON/OFF’, ‘temperature UP/DOWN’, ‘dehumidification mode ON/OFF’, and ‘humidity UP/DOWN’ of new air conditioner B.

The voice assistant server 3000 may obtain i) ‘Turn on the power’, which is the utterance data corresponding to ‘power ON’, a function of previously registered air conditioner A, ii) ‘Lower the temperature’, which is the utterance data corresponding to ‘cooling mode ON + temperature DOWN’, a function set of previously registered air conditioner A, and iii) ‘Lower the humidity’, which is the utterance data corresponding to ‘power ON + humidity DOWN’, a function set of previously registered dehumidifier A. The voice assistant server 3000 may also edit the obtained utterance data. For example, the voice assistant server 3000 may generate, from ‘Turn on the power’, ‘Lower the temperature’, and ‘Lower the humidity’, utterance data representing ‘Turn on the air conditioner, lower the temperature, and lower the humidity’.

In addition, using ‘power ON/OFF’, ‘cooling mode ON/OFF’, ‘temperature UP/DOWN’, ‘dehumidification mode ON/OFF’, and ‘humidity UP/DOWN’ of new air conditioner B and the generated utterance data ‘Turn on the air conditioner, lower the temperature, and lower the humidity’, the voice assistant server 3000 may generate action data for executing the ‘power ON’ function, then the ‘cooling mode ON’ function, then the ‘temperature DOWN’ function, then the ‘dehumidification mode ON’ function, and then the ‘humidity DOWN’ function.

FIG. 5(e) is a diagram illustrating an example of comparing a function that remains after some functions of a previously registered device are deleted with functions of a new device, according to an embodiment of the present disclosure.

Referring to FIG. 5(e), the voice assistant server 3000 may delete ‘set temperature to 26 degrees + check temperature’ from ‘set temperature to 26 degrees + check temperature + artificial intelligence mode ON’, a function set of previously registered air conditioner A, and thereby obtain the remaining function ‘artificial intelligence mode ON’. The voice assistant server 3000 may also identify that the combination of i) the remaining function ‘artificial intelligence mode ON’ and ii) ‘power ON’, a function of previously registered air conditioner A, matches ‘power ON/OFF’ and ‘artificial intelligence mode ON/OFF’ of new air conditioner B.

The voice assistant server 3000 may obtain ‘Turn on the power’, which is the utterance data corresponding to ‘power ON’, a function of previously registered air conditioner A. In addition, from ‘Turn on the artificial intelligence function at 26 degrees’, which is the utterance data corresponding to ‘set temperature to 26 degrees + check temperature + artificial intelligence mode ON’, a function set of previously registered air conditioner A, the voice assistant server 3000 may extract ‘Turn on the artificial intelligence function’, which is the utterance data corresponding to the remaining function ‘artificial intelligence mode ON’. The voice assistant server 3000 may then generate, from ‘Turn on the power’ and ‘Turn on the artificial intelligence function’, utterance data representing ‘Turn on the power and turn on the artificial intelligence function’.

In addition, using ‘power ON/OFF’ and ‘artificial intelligence mode ON/OFF’ of new air conditioner B and the generated utterance data ‘Turn on the power and turn on the artificial intelligence function’, the voice assistant server 3000 may generate action data for executing the ‘artificial intelligence mode ON’ function after executing the ‘power ON’ function.

FIG. 6 is a flowchart of a method by which a voice assistant server generates utterance data and action data related to functions of a new device that differ from functions of a previously registered device, according to an embodiment of the present disclosure.

In operation S600, the voice assistant server 3000 may use an NLG model to output a query for registering an additional function and for generating or editing utterance data. The voice assistant server 3000 may provide a GUI for registering functions of the new device 2900 and generating utterance data to the user's device 2000 or to a developer's device (not shown). The developer's device (not shown) may have a predetermined software development kit (SDK) installed for registering a new device, and may receive the GUI from the voice assistant server 3000 through the installed SDK.

The voice assistant server 3000 may provide, to the user's device 2000 or the developer's device (not shown), guide text or guide voice data for guiding the registration of functions of the new device 2900 and the generation of utterance data. The voice assistant server 3000 may generate the query for registering an additional function and generating utterance data by using an NLG model trained for function registration and utterance data generation.

In addition, the voice assistant server 3000 may provide, to the user's device 2000 or the developer's device (not shown), a list of the functions of the new device 2900 that differ from the functions of the previously registered device 2000. The voice assistant server 3000 may provide recommended utterance data related to at least some of the differing functions to the user's device 2000 or the developer's device (not shown).

In operation S610, the voice assistant server 3000 may interpret a response to the query by using an NLU model. The voice assistant server 3000 may receive a user's response to the query from the user's device 2000, or a developer's response to the query from the developer's device (not shown). The voice assistant server 3000 may interpret the user's response or the developer's response by using an NLU model trained for function registration and utterance data generation.

In addition, the voice assistant server 3000 may receive, from the user's device 2000, the user's response input entered through the GUI provided to the user's device 2000, or may receive, from the developer's device (not shown), the developer's response input entered through the GUI provided to the developer's device (not shown).

In operation S620, the voice assistant server 3000 may generate utterance data related to functions of the new device 2900 based on the interpreted response. The voice assistant server 3000 may generate utterance data related to the functions of the new device 2900 by using the interpreted user's response or the interpreted developer's response, and may recommend the generated utterance data. The voice assistant server 3000 may select some of the functions of the new device 2900 and generate utterance data related to each of the selected functions. In addition, the voice assistant server 3000 may select some of the functions of the new device 2900 and generate utterance data related to a combination of the selected functions.

Based on the identification values and attributes of the functions of the new device 2900, the voice assistant server 3000 may generate utterance data related to the functions of the new device 2900 by using an NLG model for utterance data generation. For example, the voice assistant server 3000 may input data representing the identification values and attributes of the functions of the new device into the NLG model for utterance data generation and obtain the utterance data output from the NLG model, but the present disclosure is not limited thereto.

In addition, the voice assistant server 3000 may select at least some of the generated utterance data based on the user's response input entered through the GUI and the developer's response input entered through the GUI. In addition, the voice assistant server 3000 may generate similar utterance data that has the same meaning as the generated utterance data but a different expression.

In operation S630, the voice assistant server 3000 may generate action data by using the generated utterance data. The voice assistant server 3000 may generate action data corresponding to the generated utterance data by identifying the functions of the new device 2900 related to the generated utterance data and determining an execution order of the identified functions. The generated action data may be matched to the utterance data and the similar utterance data.

FIG. 7(a) is a diagram illustrating an example of a query output from the voice assistant server to generate utterance data and action data related to a function of a new device, according to an embodiment of the present disclosure.

Referring to FIG. 7(a), the voice assistant server 3000 may provide the user's device 2000 or the developer's device (not shown) with a query for receiving an utterance related to a new function of an air conditioner, which is the new device 2900. For example, text or a query voice saying “Please enter an utterance related to the automatic drying function.” may be output from the user's device 2000 or the developer's device (not shown).

In addition, the voice assistant server 3000 may receive, from the user's device 2000 or the developer's device (not shown), the utterance “Get rid of the air conditioner smell.” that was entered on that device.

The voice assistant server 3000 may revise “Get rid of the air conditioner smell.” to “Remove the air conditioner odor.” In addition, the voice assistant server 3000 may generate “current operation OFF + drying function ON” as the action data corresponding to the revised utterance “Remove the air conditioner odor.”

FIG. 7(b) is a diagram illustrating an example in which a query recommending an utterance is output to generate utterance data and action data related to a function of a new device, according to an embodiment of the present disclosure.

Referring to FIG. 7(b), the voice assistant server 3000 may provide the user's device 2000 or the developer's device (not shown) with text or voice announcing a new function of the air conditioner, which is the new device 2900. For example, the voice assistant server 3000 may cause text or voice saying “The automatic drying function is a new function.” to be output from the user's device 2000 or the developer's device (not shown). In addition, the voice assistant server 3000 may generate a recommended utterance related to the automatic drying function, which is the new function, and provide text or voice representing the recommended utterance to the user's device 2000 or the developer's device (not shown). For example, the voice assistant server 3000 may cause the query “Would you like to register the utterance ‘When turning off the air conditioner, run the drying function.’?” to be output from the user's device 2000 or the developer's device (not shown).

In addition, the voice assistant server 3000 may receive, from the user's device 2000 or the developer's device (not shown), an input selecting the recommended utterance.

The voice assistant server 3000 may generate “confirm receipt of power OFF input + drying function ON + air conditioner power OFF” as the action data corresponding to the recommended utterance “When turning off the air conditioner, run the drying function.”

FIG. 8 is a flowchart of a method by which the voice assistant server 3000 expands utterance data, according to an embodiment of the present disclosure.

In operation S800, the voice assistant server 3000 may obtain similar utterance data related to the generated utterance data by inputting the generated utterance data into an artificial intelligence model. The voice assistant server 3000 may input the generated utterance data into an artificial intelligence model trained to generate utterance data similar to given utterance data, and thereby obtain the similar utterance data output from the artificial intelligence model. The artificial intelligence model may be, for example, a model trained by using, as training data, sets of utterance sentences for controlling a device and their similar utterance sentences.

For example, as shown in FIG. 9(a), when the utterance “Remove air conditioner odor” is input to the artificial intelligence model, similar utterances such as “Get rid of the air conditioner smell”, “The air conditioner smells”, “It smells moldy”, and “Remove the moldy smell” may be output from the artificial intelligence model. In addition, the utterance “Remove air conditioner odor” that was input to the artificial intelligence model may be set as the representative utterance. The representative utterance may be set in consideration of, for example, how frequently users use it and its grammatical accuracy, but is not limited thereto.

In addition, for example, as shown in FIG. 10(a), when the utterance “When turning off the air conditioner, run the drying function” is input to the artificial intelligence model, similar utterances such as “If the air conditioner is turned off, run the drying function”, “When turning off the air conditioner, keep it from smelling”, and “If the air conditioner is turned off, keep it from smelling” may be output from the artificial intelligence model. In this case, “If the air conditioner is turned off, run the drying function”, which is one of the similar utterances output from the artificial intelligence model, may be set as the representative utterance. The representative utterance may be set in consideration of, for example, how frequently users use it and its grammatical accuracy, but is not limited thereto.

In operation S810, the voice assistant server 3000 may match the action data to the utterance data and the similar utterance data. The voice assistant server 3000 may match the action data corresponding to the utterance data input to the artificial intelligence model to the similar utterance data output from the artificial intelligence model.

For example, as shown in FIG. 9(b), the representative utterance “Remove air conditioner odor” and the similar utterances “Get rid of the air conditioner smell”, “The air conditioner smells”, “It smells moldy”, and “Remove the moldy smell” may be mapped to the action data “current operation OFF -> drying function ON”.

In addition, for example, as shown in FIG. 10(b), the representative utterance “If the air conditioner is turned off, run the drying function” and the similar utterances “Remove air conditioner odor”, “When turning off the air conditioner, keep it from smelling”, and “If the air conditioner is turned off, keep it from smelling” may be mapped to the action data “confirm receipt of power OFF input -> drying function ON -> air conditioner power OFF”.
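As a non-limiting illustration, the matching of operation S810 may be sketched as follows, using the utterances and action data of FIG. 9(b). The names `ActionData` and `match_action_data` are hypothetical and do not appear in the disclosure; a plain dictionary is assumed here in place of whatever storage the server actually uses.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Tuple


@dataclass(frozen=True)
class ActionData:
    """Hypothetical container: detailed operations listed in execution order."""
    steps: Tuple[str, ...]


def match_action_data(representative: str,
                      similar: Iterable[str],
                      action: ActionData) -> Dict[str, ActionData]:
    """Map the representative utterance and each similar utterance to one action."""
    return {utterance: action for utterance in (representative, *similar)}


odor_removal = ActionData(("current operation OFF", "drying function ON"))

utterance_to_action = match_action_data(
    "Remove air conditioner odor",
    [
        "Get rid of the air conditioner smell",
        "The air conditioner smells",
        "It smells moldy",
        "Remove the moldy smell",
    ],
    odor_removal,
)
```

In this sketch every expression shares a single action data object, so a later change to the action data would apply to the representative utterance and all of its similar utterances at once.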

FIG. 11(a) is a diagram illustrating an example of utterance data according to an embodiment of the present disclosure.

Referring to FIG. 11(a), the utterance data may be utterances in text form. For example, the representative utterance “Turn on the TV” and the similar utterances “Please turn on the TV”, “Switch the TV on”, and “Put the TV on” may be utterance data.

FIG. 11(b) is a diagram illustrating an example of utterance data according to an embodiment of the present disclosure.

Referring to FIG. 11(b), the utterance data may include an utterance parameter and utterances. The utterance parameter is an output value of an NLU model and may include an intent and a parameter. For example, the utterance parameter included in the utterance data may include the intent “power on” and the parameter “TV”. In addition, for example, the utterances included in the utterance data may include text such as “Turn on the TV”, “Please turn on the TV”, “Switch the TV on”, and “Put the TV on”. Although FIG. 11(b) describes the utterance data as including both an utterance parameter and utterances, the present disclosure is not limited thereto, and the utterance data may include only utterance parameters.
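As a non-limiting illustration, the three forms of utterance data described with reference to FIGS. 11(a) and 11(b) may be represented as follows. The class and field names are hypothetical; the disclosure does not prescribe a concrete data layout.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class UtteranceParameter:
    """Hypothetical holder for the NLU output: an intent and a parameter."""
    intent: str
    parameter: str


@dataclass
class UtteranceData:
    """Utterance data: text utterances, an utterance parameter, or both."""
    utterances: List[str] = field(default_factory=list)
    utterance_parameter: Optional[UtteranceParameter] = None


tv_utterances = [
    "Turn on the TV",        # representative utterance
    "Please turn on the TV",
    "Switch the TV on",
    "Put the TV on",
]

# FIG. 11(a): text utterances only.
text_only = UtteranceData(utterances=tv_utterances)

# FIG. 11(b): utterance parameter together with the text utterances.
with_parameter = UtteranceData(
    utterances=tv_utterances,
    utterance_parameter=UtteranceParameter(intent="power on", parameter="TV"),
)

# Variant noted in the text: utterance parameters only.
parameter_only = UtteranceData(
    utterance_parameter=UtteranceParameter(intent="power on", parameter="TV"),
)
```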

FIG. 13 is a block diagram of a voice assistant server according to an embodiment of the present disclosure.

Referring to FIG. 13, the voice assistant server 3000 may include a communication interface 3100, a processor 3200, and a storage unit 3300, and the storage unit 3300 may include a first voice assistant model 3310, at least one second voice assistant model 3320, an SDK interface module 3330, and a DB 3340.

The communication interface 3100 communicates with the client device 1000, the device 2000, and the IoT cloud server 4000. The communication interface 3100 may include one or more components for communicating with the client device 1000, the device 2000, and the IoT cloud server 4000.

The processor 3200 typically controls the overall operation of the voice assistant server 3000. For example, the processor 3200 may control the functions of the voice assistant server 3000 described in this specification by executing programs stored in the storage unit 3300.

The storage unit 3300 may store programs for the processing and control performed by the processor 3200, and may store data related to the functions of the new device 2900. The storage unit 3300 may include at least one type of storage medium among a flash memory type, a hard disk type, a multimedia card micro type, a card-type memory (for example, SD or XD memory), RAM (Random Access Memory), SRAM (Static Random Access Memory), ROM (Read-Only Memory), EEPROM (Electrically Erasable Programmable Read-Only Memory), PROM (Programmable Read-Only Memory), a magnetic memory, a magnetic disk, and an optical disk.

The programs stored in the storage unit 3300 may be classified into a plurality of modules according to their functions, for example, into the first voice assistant model 3310, the at least one second voice assistant model 3320, and the SDK interface module 3330.

The first voice assistant model 3310 is a model that analyzes a user's voice input to determine a target device related to the user's intent. The first voice assistant model 3310 may include an ASR (Automatic Speech Recognition) model 3311, a first NLU model 3312, a first NLG model 3313, a device determination module 3314, a function comparison module 3315, an utterance data acquisition module 3316, an action data generation module 3317, and a model updater 3318.

The ASR model 3311 converts a voice signal into text by performing ASR. The ASR model 3311 may perform ASR that converts a voice signal into computer-readable text by using a predefined model such as an acoustic model (AM) or a language model (LM). When an acoustic signal from which noise has not been removed is received from the client device 1000, the ASR model 3311 may remove noise from the received acoustic signal to obtain a voice signal and perform ASR on the voice signal.

The first NLU model 3312 analyzes the text and determines a first intent related to the user's intention based on the analysis result. The first NLU model 3312 may be a model trained to interpret text and obtain a first intent corresponding to the text. An intent may be information indicating the user's utterance intention contained in the text.

The device determination module 3314 may determine the user's first intent from the converted text by performing syntactic analysis or semantic analysis using the first NLU model 3312. In an embodiment, the device determination module 3314 may use the first NLU model 3312 to parse the converted text into units of morphemes, words, or phrases, and to infer the meaning of the words extracted from the parsed text by using linguistic features (for example, grammatical elements) of the parsed morphemes, words, or phrases. The device determination module 3314 may determine the first intent corresponding to the inferred meaning of the words by comparing the inferred meaning with predefined intents provided by the first NLU model 3312. The device determination module 3314 may determine the type of the target device based on the first intent. In an embodiment, the device determination module 3314 may determine the type of the target device by using the first intent obtained with the first NLU model 3312. The device determination module 3314 provides the parsed text and target device information to the second voice assistant model 3320. In an embodiment, the device determination module 3314 may provide identification information (for example, a device ID) of the determined target device to the second voice assistant model 3320 together with the parsed text.
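As a non-limiting illustration, the flow described above (parse the text, compare it with predefined intents, determine the target device type from the first intent, and hand the result onward) may be sketched as follows. Simple keyword matching is assumed here purely as a stand-in for the trained first NLU model 3312; the intent names, keywords, and device types in `PREDEFINED_INTENTS` are hypothetical.

```python
from typing import Dict, List, Optional

# Hypothetical table of predefined intents. Keyword matching stands in for
# the trained NLU model; it is not the method of the disclosure.
PREDEFINED_INTENTS: Dict[str, Dict[str, object]] = {
    "odor_removal": {"keywords": {"odor", "smell"}, "device_type": "air_conditioner"},
    "tv_power_on": {"keywords": {"tv"}, "device_type": "tv"},
}


def parse(text: str) -> List[str]:
    """Split the converted text into lowercase word units."""
    return text.lower().replace(",", " ").replace(".", " ").split()


def determine_first_intent(words: List[str]) -> Optional[str]:
    """Return the first predefined intent whose keywords appear in the text."""
    for intent, entry in PREDEFINED_INTENTS.items():
        if entry["keywords"] & set(words):
            return intent
    return None


def determine_target_device(text: str) -> Optional[Dict[str, object]]:
    """Return the parsed text and target device information, or None."""
    words = parse(text)
    intent = determine_first_intent(words)
    if intent is None:
        return None
    return {
        "parsed_text": words,
        "first_intent": intent,
        "device_type": PREDEFINED_INTENTS[intent]["device_type"],
    }


result = determine_target_device("Remove the air conditioner odor.")
```

The returned dictionary corresponds to what would be passed to the device-specific second voice assistant model 3320 in this sketch.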

The first NLG model 3313 may generate a query message for registering a function of the new device 2900 and for generating or editing utterance data.

The function comparison module 3315 may compare the functions of the previously registered device 2000 with the functions of the new device 2900. The function comparison module 3315 may determine whether a function of the previously registered device 2000 and a function of the new device 2900 are the same or similar. The function comparison module 3315 may identify, among the functions of the new device 2900, a function that is the same as or similar to a function of the previously registered device 2000.

The function comparison module 3315 may identify, from specification information of the new device 2900, a name indicating a function supported by the new device 2900, and determine whether the identified name is the same as or similar to the name of a function supported by the previously registered device 2000. In this case, the DB 3340 may store in advance information on names indicating certain functions and their synonyms, and whether a function of the previously registered device 2000 and a function of the new device 2900 are the same or similar may be determined based on the stored synonym information.
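As a non-limiting illustration, the name-based comparison using stored synonym information may be sketched as follows. The function names and the contents of `SYNONYMS` are hypothetical examples; an in-memory dictionary is assumed in place of the DB 3340.

```python
from typing import Dict, List, Set

# Hypothetical synonym information, keyed by a canonical function name.
SYNONYMS: Dict[str, Set[str]] = {
    "power on": {"turn on", "switch on"},
    "cooling": {"cool mode"},
}


def normalize(name: str) -> str:
    return name.strip().lower()


def is_same_or_similar(a: str, b: str, synonyms: Dict[str, Set[str]]) -> bool:
    """True if the two function names are identical or share a synonym group."""
    a, b = normalize(a), normalize(b)
    if a == b:
        return True
    for canonical, alternatives in synonyms.items():
        group = {canonical} | alternatives
        if a in group and b in group:
            return True
    return False


def identify_matching_functions(new_functions: List[str],
                                registered_functions: List[str],
                                synonyms: Dict[str, Set[str]]) -> List[str]:
    """Functions of the new device that match a previously registered function."""
    return [
        f for f in new_functions
        if any(is_same_or_similar(f, r, synonyms) for r in registered_functions)
    ]


new_device_functions = ["Power On", "Cooling", "Automatic Drying"]
registered_device_functions = ["turn on", "cool mode", "dehumidification"]

matched = identify_matching_functions(
    new_device_functions, registered_device_functions, SYNONYMS
)
unmatched = [f for f in new_device_functions if f not in matched]
```

In this sketch the unmatched functions are the ones for which no previously registered utterance data can be reused, so new utterance data would have to be generated for them.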

In addition, the function comparison module 3315 may determine whether functions are the same or similar by referring to the utterance data stored in the DB 3340. The function comparison module 3315 may determine whether a function of the new device 2900 is the same as or similar to a function of the previously registered device 2000 by using utterance data related to the function of the previously registered device 2000. In this case, the function comparison module 3315 may interpret the utterance data by using the first NLU model, and determine whether the function of the new device 2900 is the same as or similar to the function of the previously registered device 2000 based on the meanings of the words included in the utterance data.

The function comparison module 3315 may determine whether a single function of the previously registered device 2000 and a single function of the new device 2900 are the same or similar. The function comparison module 3315 may also determine whether a function set of the previously registered device 2000 and a function set of the new device 2900 are the same or similar.

The utterance data acquisition module 3316 may obtain utterance data related to a function of the new device 2900. The utterance data acquisition module 3316 may extract, from the utterance data DB 3341, utterance data corresponding to a function that, among the functions of the previously registered device 2000, has been determined to be the same as or similar to a function of the new device 2900.

The utterance data acquisition module 3316 may extract, from the utterance data DB 3341, utterance data corresponding to a function set that, among the function sets of the previously registered device 2000, has been determined to be the same as or similar to the functions of the new device 2900. In this case, the utterance data corresponding to the functions of the previously registered device 2000 and the utterance data corresponding to the function sets of the previously registered device 2000 may be stored in advance in the utterance data DB 3341.

The utterance data acquisition module 3316 may also edit the functions and function sets determined to be the same or similar, and generate utterance data corresponding to the edited functions. The utterance data acquisition module 3316 may combine functions determined to be the same or similar and generate utterance data corresponding to the combined functions. In addition, the utterance data acquisition module 3316 may combine a function and a function set determined to be the same or similar and generate utterance data corresponding to the combined functions. In addition, the utterance data acquisition module 3316 may delete some of the functions in a function set determined to be the same or similar and generate utterance data corresponding to the function set from which those functions have been deleted.

The utterance data acquisition module 3316 may expand the utterance data. By modifying the expression of the extracted or generated utterance data, the utterance data acquisition module 3316 may generate similar utterance data that has the same meaning as the extracted or generated utterance data but a different expression.

The utterance data acquisition module 3316 may use the first NLG model 3313 to output a query for registering an additional function and for generating or editing utterance data. The utterance data acquisition module 3316 may provide the user's device 2000 or the developer's device (not shown) with guide text or guide voice data for registering a function of the new device 2900 and guiding the generation of utterance data. The utterance data acquisition module 3316 may provide the user's device 2000 or the developer's device (not shown) with a list of those functions of the new device 2900 that differ from the functions of the previously registered device 2000. The utterance data acquisition module 3316 may provide the user's device 2000 or the developer's device (not shown) with recommended utterance data related to at least some of the differing functions.

The utterance data acquisition module 3316 may interpret a response to the query by using the first NLU model 3312. The utterance data acquisition module 3316 may generate utterance data related to the functions of the new device 2900 based on the interpreted response. The utterance data acquisition module 3316 may generate utterance data related to the functions of the new device 2900 by using the interpreted response of the user or the interpreted response of the developer, and may recommend the generated utterance data. The utterance data acquisition module 3316 may select some of the functions of the new device 2900 and generate utterance data related to each of the selected functions. The utterance data acquisition module 3316 may select some of the functions of the new device 2900 and generate utterance data related to a combination of the selected functions. The utterance data acquisition module 3316 may generate utterance data related to a function of the new device 2900 by using the first NLG model 3313, based on the identification values and attributes of the functions of the new device 2900.

The action data generation module 3317 may generate action data for the new device 2900 based on the same or similar functions and the utterance data. For example, when the function corresponding to the utterance data is a single function, the action data generation module 3317 may generate action data including a detailed operation representing the single function. For example, when the function corresponding to the utterance data is a function set, the action data generation module 3317 may generate detailed operations representing the functions in the function set and an execution order of the detailed operations. The action data generation module 3317 may generate action data by using utterance data generated in relation to a new function of the new device 2900. The action data generation module 3317 may generate action data corresponding to the generated utterance data by identifying the new functions of the new device 2900 related to the utterance data and determining an execution order of the identified functions. The generated action data may be matched to the utterance data and the similar utterance data.
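As a non-limiting illustration, the difference between action data for a single function and action data for a function set may be sketched as follows. The function names, the `DETAILED_OPERATIONS` table, and the representation of action data as numbered steps are all assumptions made for this example.

```python
from typing import Dict, List, Tuple

# Hypothetical table: each function name maps to its detailed operations.
DETAILED_OPERATIONS: Dict[str, List[str]] = {
    "stop current operation": ["current operation OFF"],
    "automatic drying": ["drying function ON"],
}


def generate_action_data(functions: List[str],
                         detailed_operations: Dict[str, List[str]]
                         ) -> List[Tuple[int, str]]:
    """Return (execution order, detailed operation) pairs.

    `functions` holds one name for a single function, or several names,
    already listed in the intended order, for a function set.
    """
    steps: List[str] = []
    for name in functions:
        steps.extend(detailed_operations[name])
    return [(order, step) for order, step in enumerate(steps, start=1)]


# Single function: one detailed operation.
single = generate_action_data(["automatic drying"], DETAILED_OPERATIONS)

# Function set: detailed operations together with their execution order.
function_set = generate_action_data(
    ["stop current operation", "automatic drying"], DETAILED_OPERATIONS
)
```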

The model updater 3318 may generate or update the second voice assistant model 3320 related to the new device 2900 by using the utterance data and the action data. The model updater 3318 may generate or update the second voice assistant model 3320 related to the new device 2900 by using utterance data corresponding to functions of the previously registered device 2000 that are related to functions of the new device 2900, utterance data newly generated in relation to functions of the new device 2900, expanded utterance data, and action data. The model updater 3318 may accumulate and store the utterance data and action data related to the new device 2900 in the utterance data DB 3341 and the action data DB 3342. In addition, the model updater 3318 may generate or update a CAN (Concept Action Network), which is a capsule-type database included in the action plan management model 3323.

The second voice assistant model 3320 is a model specialized for a specific device and may determine an operation to be performed by the target device in response to the user's voice input. The second voice assistant model 3320 may include a second NLU model 3321, a second NLG model 3322, and an action plan management model 3323. The voice assistant server 3000 may include a second voice assistant model 3320 for each device type.

The second NLU model 3321 is an NLU model specialized for a specific device; it analyzes text and determines, based on the analysis result, a second intent related to the user's intention. The second NLU model 3321 may interpret the user's input voice in consideration of the functions of the device. The second NLU model 3321 may be a model trained to interpret text and obtain a second intent corresponding to the text.

The second NLG model 3322 is an NLG model specialized for a specific device and may generate a query message required to provide the voice assistant service to the user. The second NLG model 3322 may generate natural language for a conversation with the user in consideration of the functions of the device.

The action plan management model 3323 is a model specialized for a device and may be a model that determines an operation to be performed by a target device in response to a user's voice input. The action plan management model 3323 may plan operation information to be performed by the new device 2900 in consideration of the functions of the new device 2900.

The action plan management model 3323 may select, from the interpreted user utterance, detailed operations to be performed by the new device 2900 and may plan an execution order of the selected detailed operations. The action plan management model 3323 may obtain operation information about the detailed operations to be performed by the new device 2900 by using the planning result. The operation information may be information related to the detailed operations to be performed by the device, the associations between the detailed operations, and the execution order of the detailed operations. The operation information may include, for example, functions to be executed by the new device 2900 in order to perform the detailed operations, an execution order of the functions, input values required to execute the functions, and output values output as results of executing the functions, but is not limited thereto.

The action plan management model 3323 may manage information about a plurality of detailed operations of the new device 2900 and the relationships between the plurality of detailed operations. The association between each detailed operation and the other detailed operations may include information about other detailed operations that must be executed before a given detailed operation can be executed.
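The prerequisite relationships described above can be illustrated with a small sketch. It assumes that each detailed operation lists the operations that must run before it and derives an execution order with a topological sort. The operation names and the `plan_execution_order` helper are hypothetical and are not taken from the disclosure.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical detailed operations; each maps to the operations that must be
# executed before it (illustrative names, not from the disclosure).
PREREQUISITES = {
    "power_on": set(),
    "select_input": {"power_on"},
    "set_volume": {"power_on"},
    "play_media": {"select_input", "set_volume"},
}

def plan_execution_order(selected):
    """Return the selected operations plus all prerequisites in a valid order."""
    needed = set()
    stack = list(selected)
    while stack:  # collect transitive prerequisites
        op = stack.pop()
        if op not in needed:
            needed.add(op)
            stack.extend(PREREQUISITES[op])
    graph = {op: PREREQUISITES[op] for op in needed}
    # static_order() yields every operation after all of its prerequisites.
    return list(TopologicalSorter(graph).static_order())

order = plan_execution_order(["play_media"])
```

In this sketch, requesting only `play_media` pulls in `power_on`, `select_input`, and `set_volume`, and `power_on` is always scheduled first.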

The action plan management model 3323 may include a Concept Action Network (CAN), which is a capsule-type database representing the operations of a device and the associations between the operations. The CAN includes functions to be executed by the device in order to perform a specific operation, an execution order of the functions, input values required to execute the functions, and output values output as results of executing the functions, and may be implemented as an ontology graph composed of knowledge triples representing concepts and the relationships between concepts.
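As a rough illustration of an ontology graph built from knowledge triples, the following sketch stores (subject, relation, object) triples that link actions to the concepts they consume and produce. The relation names, action names, and concept names are assumptions made for illustration only; the disclosure does not specify a triple schema.

```python
from collections import namedtuple

Triple = namedtuple("Triple", ["subject", "relation", "obj"])

# Hypothetical triples linking actions to input and output concepts.
TRIPLES = [
    Triple("set_volume", "requires_input", "VolumeLevel"),
    Triple("set_volume", "produces_output", "VolumeState"),
    Triple("mute", "requires_input", "VolumeState"),
    Triple("mute", "produces_output", "MuteState"),
]

def concepts_for(action, relation):
    """Concepts connected to an action by the given relation."""
    return [t.obj for t in TRIPLES if t.subject == action and t.relation == relation]

def actions_producing(concept):
    """Actions whose output is the given concept."""
    return [t.subject for t in TRIPLES
            if t.relation == "produces_output" and t.obj == concept]
```

Because `mute` requires `VolumeState` and `set_volume` produces it, a planner could follow these triples to find which action supplies an input that another action needs.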

The SDK interface module 3330 may transmit and receive data to and from the client device 1000 or a developer's device (not shown) through the communication interface 3100. The client device 1000 or the developer's device (not shown) may install a predetermined software development kit (SDK) for registering a new device and may receive a GUI from the voice assistant server 3000 through the installed SDK. The processor 3200 may provide a GUI for registering the functions of the new device 2900 and generating utterance data to the user's device 2000 or the developer's device (not shown) through the SDK interface module 3330. The processor 3200 may receive, from the user's device 2000 through the SDK interface module 3330, the user's response input entered through the GUI provided to the user's device 2000, or may receive, from the developer's device (not shown) through the SDK interface module 3330, the developer's response input entered through the GUI provided to the developer's device (not shown). The SDK interface module 3330 may also transmit and receive data to and from the IoT cloud server 4000 through the communication interface 3100.

The DB 3340 may store various types of information for the voice assistant service. The DB 3340 may include an utterance data DB 3341 and an action data DB 3342.

The utterance data DB 3341 may store utterance data related to functions of the client device 1000, the device 2000, and the new device 2900.

The action data DB 3342 may store action data related to functions of the client device 1000, the device 2000, and the new device 2900. The utterance data stored in the utterance data DB 3341 and the action data stored in the action data DB 3342 may be mapped to each other.
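One possible way to picture the mapping between utterance data and action data is a pair of keyed stores joined by a function identifier. This is a minimal sketch under that assumption; the record fields, identifiers, and command names are hypothetical and do not come from the disclosure.

```python
# Hypothetical utterance records, each tagged with the function it refers to.
utterance_db = {
    "u1": {"text": "turn up the volume", "function": "volume_up"},
    "u2": {"text": "make it louder", "function": "volume_up"},
    "u3": {"text": "turn off the TV", "function": "power_off"},
}

# Hypothetical action records keyed by the same function identifier.
action_db = {
    "volume_up": {"command": "setVolume", "params": {"delta": 1}},
    "power_off": {"command": "setPower", "params": {"state": "off"}},
}

def action_for_utterance(utterance_id):
    """Follow the utterance-to-action mapping through the shared function key."""
    function = utterance_db[utterance_id]["function"]
    return action_db[function]

def utterances_for_function(function):
    """Reverse lookup: all utterance texts mapped to one function."""
    return [u["text"] for u in utterance_db.values() if u["function"] == function]
```

Several utterances can map to the same action, which is what allows previously registered utterances to be reused when a new device supports the same function.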

FIG. 14 is a diagram illustrating another example of a voice assistant server according to an embodiment of the present disclosure.

Referring to FIG. 14, the voice assistant server 3000 may include a single second voice assistant model 3320. In this case, the second voice assistant model 3320 may include a plurality of second NLU models 3324, 3325, and 3326. The plurality of second NLU models 3324, 3325, and 3326 may be NLU models each specialized for a type of device.

FIG. 15 is a conceptual diagram illustrating an action plan management model according to an embodiment of the present disclosure.

Referring to FIG. 15, the action plan management model 3323 may include a speaker CAN 212, a mobile CAN 214, and a TV CAN 216. The speaker CAN 212 may include an action plan that stores, in capsule form, information about detailed operations including speaker control, media playback, weather, and TV control, together with the concepts corresponding to each of the detailed operations. The mobile CAN 214 may include an action plan that stores, in capsule form, information about detailed operations including SNS, mobile control, map, and QA, together with the concepts corresponding to each of the detailed operations. The TV CAN 216 may include an action plan that stores, in capsule form, information about detailed operations including shopping, media playback, education, and TV playback, together with the concepts corresponding to each of the detailed operations. In an embodiment, the plurality of capsules included in each of the speaker CAN 212, the mobile CAN 214, and the TV CAN 216 may be stored in a function registry, which is a component of the action plan management model 3323.

In an embodiment, the action plan management model 3323 may include a strategy registry that is needed when the voice assistant server 3000 determines detailed operations corresponding to a second intent and parameters, where the second intent and parameters are determined by interpreting, through the second NLU model, text converted from a voice input. When there are a plurality of action plans related to the text, the strategy registry may include reference information for selecting one action plan. In an embodiment, the action plan management model 3323 may include a follow-up registry that stores information about follow-up operations for suggesting a follow-up operation to the user in a specified situation. The follow-up operation may include, for example, a follow-up utterance.

In an embodiment, the action plan management model 3323 may include a layout registry that stores layout information output by the target device.

In an embodiment, the action plan management model 3323 may include a vocabulary registry that stores vocabulary information included in capsule information. In an embodiment, the action plan management model 3323 may include a dialog registry that stores information about dialogs (or interactions) with the user.

FIG. 16 is a diagram illustrating a capsule database stored in an action plan management model according to an embodiment of the present disclosure.

Referring to FIG. 16, the capsule database stores detailed operations and relationship information about the concepts corresponding to the detailed operations. The capsule database may be implemented in the form of a Concept Action Network (CAN). The capsule database may store a plurality of capsules 230, 240, and 250. The capsule database may store, in the form of a CAN, detailed operations for executing operations related to a user's voice input, and input parameter values and output result values required for the detailed operations.

The capsule database may store operation-related information for each device. In the embodiment shown in FIG. 19, the capsule database may store a plurality of capsules 230, 240, and 250 related to operations performed by a specific device, for example, a TV. In an embodiment, one capsule (for example, capsule A 230) may correspond to one application. One capsule may include at least one detailed operation and at least one concept for performing a specified function. For example, capsule A 230 may include a detailed operation 231a and a concept 231b corresponding to the detailed operation 231a, and capsule B 240 may include a plurality of detailed operations 241a, 242a, and 243a and a plurality of concepts 241b, 242b, and 243b respectively corresponding to the detailed operations 241a, 242a, and 243a.

The action plan management model 210 may generate an action plan for performing an operation related to a user's voice input by using the capsules stored in the capsule database. For example, the action plan management model 210 may generate an action plan 260 related to the operations to be performed by the device by using the detailed operation 231a and the concept 231b of capsule A 230, the detailed operations 241a, 242a, and 243a and the concepts 241b, 242b, and 243b of capsule B 240, and the detailed operation 251a and the concept 251b of capsule C 250.
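The way an action plan draws on several capsules can be sketched as follows. The reference numerals from the description are reused as plain string labels, while the `Capsule` and `DetailedOperation` types and the `build_action_plan` function are hypothetical structures introduced only for illustration.

```python
from dataclasses import dataclass, field

@dataclass
class DetailedOperation:
    name: str      # e.g. "231a"
    concept: str   # concept paired with the operation, e.g. "231b"

@dataclass
class Capsule:
    name: str
    operations: list = field(default_factory=list)

def build_action_plan(capsules, wanted):
    """Collect the wanted operations, in the requested order, across capsules."""
    index = {op.name: (capsule.name, op)
             for capsule in capsules for op in capsule.operations}
    plan = []
    for name in wanted:
        capsule_name, op = index[name]
        plan.append({"capsule": capsule_name,
                     "operation": op.name,
                     "concept": op.concept})
    return plan

capsules = [
    Capsule("A", [DetailedOperation("231a", "231b")]),
    Capsule("B", [DetailedOperation("241a", "241b"),
                  DetailedOperation("242a", "242b"),
                  DetailedOperation("243a", "243b")]),
    Capsule("C", [DetailedOperation("251a", "251b")]),
]
action_plan = build_action_plan(capsules, ["231a", "241a", "242a", "243a", "251a"])
```

The resulting list plays the role of the action plan 260 in this sketch: an ordered sequence of operation and concept pairs taken from capsules A, B, and C.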

FIG. 17 is a block diagram of an IoT cloud server according to an embodiment of the present disclosure.

Referring to FIG. 17, the IoT cloud server 4000 according to an embodiment of the present disclosure includes a communication interface 4100, a processor 4200, and a storage unit 4300, and the storage unit 4300 may include an SDK interface module 4310, a function comparison module 4320, a device registration module 4330, and a DB 4340. In addition, the DB 4340 may include a device function DB 4341 and an action data DB 4342.

The communication interface 4100 communicates with the client device 1000, the device 2000, and the voice assistant server 3000. The communication interface 4100 may include one or more components for communicating with the client device 1000, the device 2000, and the voice assistant server 3000.

The processor 4200 typically controls the overall operation of the IoT cloud server 4000. For example, the processor 4200 may control the functions of the IoT cloud server 4000 described in this specification by executing programs stored in the storage unit 4300.

The storage unit 4300 may store programs for processing and control by the processor 4200 and may store data related to the functions of the device 2000. The storage unit 4300 may include at least one type of storage medium among a flash memory type, a hard disk type, a multimedia card micro type, a card-type memory (for example, SD or XD memory), random access memory (RAM), static random access memory (SRAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), programmable read-only memory (PROM), magnetic memory, a magnetic disk, and an optical disk.

The programs stored in the storage unit 4300 may be classified into a plurality of modules according to their functions, for example, the SDK interface module 4310, the function comparison module 4320, and the device registration module 4330.

The SDK interface module 4310 may transmit and receive data to and from the voice assistant server 3000 through the communication interface 4100. The processor 4200 may provide function information of the device 2000 to the voice assistant server 3000 through the SDK interface module 4310.

When the function comparison module 4320 is included in the IoT cloud server 4000, the function comparison module 4320 may perform the role of the function comparison module 3315 of the voice assistant server 3000 described above.

In this case, the function comparison module 4320 may compare the functions of the previously registered device 2000 with the functions of the new device 2900. The function comparison module 4320 may determine whether a function of the previously registered device 2000 and a function of the new device 2900 are the same or similar. The function comparison module 4320 may identify, among the functions of the new device 2900, a function that is the same as or similar to a function of the previously registered device 2000.

The function comparison module 4320 may identify, from the specification information of the new device 2900, a name indicating a function supported by the new device 2900, and may determine whether the identified name is the same as or similar to the name of a function supported by the previously registered device 2000. In this case, the DB 4340 may store in advance information about names indicating predetermined functions and their synonyms, and whether a function of the previously registered device 2000 and a function of the new device 2900 are the same or similar may be determined based on the stored synonym information.
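A name-based comparison that consults stored synonym information could look like the following sketch. The synonym table, the function names, and the `match_functions` helper are illustrative assumptions; the disclosure does not prescribe a particular matching procedure.

```python
# Hypothetical synonym table: canonical function name -> accepted names.
SYNONYMS = {
    "power": {"power", "on/off", "switch"},
    "volume": {"volume", "sound level", "loudness"},
}

def canonical(name):
    """Map a function name to its canonical form via the synonym table."""
    key = name.strip().lower()
    for canon, words in SYNONYMS.items():
        if key in words:
            return canon
    return key  # unknown names are compared as-is

def match_functions(new_functions, registered_functions):
    """Split new-device functions into matched and unmatched names."""
    registered = {canonical(n): n for n in registered_functions}
    matched, unmatched = {}, []
    for name in new_functions:
        canon = canonical(name)
        if canon in registered:
            matched[name] = registered[canon]
        else:
            unmatched.append(name)
    return matched, unmatched

matched, unmatched = match_functions(["Loudness", "Switch", "Timer"],
                                     ["Volume", "Power"])
```

In this sketch, matched functions are candidates for reusing previously registered utterance data, while unmatched functions such as `Timer` would need newly generated utterance data.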

In addition, the function comparison module 4320 may determine whether functions are the same or similar by referring to utterance data stored in the DB 4340. The function comparison module 3315 may determine whether a function of the new device 2900 is the same as or similar to a function of the previously registered device 2000 by using utterance data related to the function of the previously registered device 2000. The function comparison module 4320 may determine whether a single function of the previously registered device 2000 and a single function of the new device 2900 are the same or similar. The function comparison module 4320 may also determine whether a function set of the previously registered device 2000 and a function set of the new device 2900 are the same or similar.
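The disclosure does not say how the similarity of two function sets is measured. As one possible illustration, the sketch below uses the Jaccard index (intersection size divided by union size) with an arbitrary threshold; both the metric and the threshold value of 0.5 are assumptions chosen for this example.

```python
def jaccard(a, b):
    """Jaccard index of two collections, treated as sets."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # two empty sets are treated as identical
    return len(a & b) / len(a | b)

def compare_function_sets(registered, new, threshold=0.5):
    """Classify two function sets as identical, similar, or different."""
    score = jaccard(registered, new)
    if score == 1.0:
        return "identical"
    return "similar" if score >= threshold else "different"
```

For example, a registered set of power, volume, and channel compared against a new set of power and volume shares two of three distinct functions, so this sketch would classify the sets as similar.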

The device registration module 4330 may register a device for the voice assistant service. When the new device 2900 is identified, the device registration module 4330 may receive information about the functions of the new device 2900 from the voice assistant server 3000 and register the received information in the DB 4340. The information about the functions of the new device 2900 may include, for example, the functions supported by the new device 2900 and action data related to the functions, but is not limited thereto.

The DB 4340 may store device information required for the voice assistant service. The DB 4340 may include a device function DB 4341 and an action data DB 4342. The device function DB 4341 may store function information of the client device 1000, the device 2000, and the new device 2900. The function information may include an identification value of a function of a device, a name of the function, and information about attributes of the function, but is not limited thereto. The action data DB 4342 may store action data related to the functions of the client device 1000, the device 2000, and the new device 2900.
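The function information described above (an identification value, a name, and attributes) can be pictured as a simple record stored per device. The following sketch is only an assumed in-memory layout; the class names, field names, and sample values are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class FunctionInfo:
    function_id: str         # identification value of the function
    name: str                # name of the function
    attributes: tuple = ()   # attribute information, e.g. value ranges

class DeviceFunctionDB:
    """Toy in-memory stand-in for a device function store."""

    def __init__(self):
        self._by_device = {}

    def register(self, device_id, info):
        # Later registrations with the same function_id overwrite earlier ones.
        self._by_device.setdefault(device_id, {})[info.function_id] = info

    def functions_of(self, device_id):
        functions = self._by_device.get(device_id, {})
        return sorted(functions.values(), key=lambda f: f.function_id)

db = DeviceFunctionDB()
db.register("new-device", FunctionInfo("f2", "volume", ("min=0", "max=100")))
db.register("new-device", FunctionInfo("f1", "power", ("on", "off")))
```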

FIG. 18 is a block diagram of a client device according to an embodiment of the present disclosure.

Referring to FIG. 18, the client device 1000 according to an embodiment of the present disclosure includes an input unit 1100, an output unit 1200, a processor 1300, a memory 1400, and a communication interface 1500, and the memory 1400 may include an SDK module 1420.

According to an embodiment, the device 2000 may operate as the client device 1000, or the new device 2900 may operate as the client device 1000 after it has been registered. The device 2000 or the new device 2900 may also include the components shown in FIG. 18.

The input unit 1100 refers to a means by which a user inputs data for controlling the client device 1000. For example, the input unit 1100 may include a key pad, a dome switch, a touch pad (a contact capacitive type, a pressure resistive film type, an infrared sensing type, a surface ultrasonic conduction type, an integral tension measurement type, a piezo effect type, etc.), a jog wheel, or a jog switch, but is not limited thereto.

The input unit 1100 may receive a user input for registering the new device 2900.

The output unit 1200 may output an audio signal, a video signal, or a vibration signal, and may include at least one of a display unit, a sound output unit, or a vibration motor.

The processor 1300 controls the overall operation of the client device 1000. For example, the processor 1300 may control the input unit 1100, the output unit 1200, the memory 1400, the communication interface 1500, and the like by executing programs stored in the memory 1400.

The processor 1300 may request, from the user, an input for registering the functions of the new device 2900. By controlling the SDK module 1420 described below, the processor 1300 may perform operations for registering the new device 2900 together with the voice assistant server 3000.

The processor 1300 may receive, from the voice assistant server 3000, a query message for generating and editing utterance data related to the functions of the new device 2900, and may output the query message. The processor 1300 may provide the user with a list of functions of the new device 2900 that differ from the functions of the previously registered device 2000. The processor 1300 may provide the user with recommended utterance data related to at least some of the functions of the new device 2900.

The processor 1300 may receive the user's response to the query message. The processor 1300 may provide the user's response to the voice assistant server 3000 so that the voice assistant server 3000 generates utterance data and action data related to the functions of the new device 2900.

The communication interface 1500 may include one or more components that enable communication with the voice assistant server 3000, the IoT cloud server 4000, the device 2000, and the new device 2900. For example, the communication interface 1500 may include a short-range communication unit, a mobile communication unit, and a broadcast receiver.

The short-range wireless communication unit may include a Bluetooth communication unit, a Bluetooth Low Energy (BLE) communication unit, a near field communication (NFC) unit, a WLAN (Wi-Fi) communication unit, a Zigbee communication unit, an Infrared Data Association (IrDA) communication unit, a Wi-Fi Direct (WFD) communication unit, an ultra wideband (UWB) communication unit, an Ant+ communication unit, and the like, but is not limited thereto.

The mobile communication unit transmits and receives wireless signals to and from at least one of a base station, an external terminal, or a server on a mobile communication network. Here, the wireless signals may include a voice call signal, a video call signal, or various types of data according to the transmission and reception of text or multimedia messages.

The broadcast receiver receives broadcast signals and/or broadcast-related information from the outside through a broadcast channel. The broadcast channel may include a satellite channel and a terrestrial channel. Depending on the implementation, the first device 1000 may not include the broadcast receiver 1530.

The memory 1400 may store programs for processing and control by the processor 1300 and may store data input to or output from the client device 1000.

The memory 1400 may include at least one type of storage medium among a flash memory type, a hard disk type, a multimedia card micro type, a card-type memory (for example, SD or XD memory), random access memory (RAM), static random access memory (SRAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), programmable read-only memory (PROM), magnetic memory, a magnetic disk, and an optical disk.

The programs stored in the memory 1400 may be classified into a plurality of modules according to their functions, for example, the SDK module 1420, a UI module (not shown), a touch screen module (not shown), and a notification module (not shown).

The SDK module 1420 may be executed by the processor 1300 to perform the operations required to register the new device 2900. The SDK module 1420 may be downloaded from the voice assistant server 3000 and installed in the client device 1000. The SDK module 1420 may output a GUI for registering the new device 2900 on the screen of the client device 1000. If the client device 1000 does not include a display device, the SDK module 1420 may cause the client device 1000 to output a voice message for registering the new device 2900. The SDK module 1420 may cause the client device 1000 to receive a response from the user and provide the response to the voice assistant server 3000.

An embodiment of the present disclosure may also be implemented in the form of a recording medium including computer-executable instructions, such as a program module executed by a computer. A computer-readable medium may be any available medium that can be accessed by a computer, and includes volatile and non-volatile media as well as removable and non-removable media. In addition, computer-readable media may include computer storage media and communication media. Computer storage media include volatile and non-volatile, removable and non-removable media implemented by any method or technology for storing information such as computer-readable instructions, data structures, program modules, or other data. Communication media may typically include computer-readable instructions, data structures, program modules, or other data in a modulated data signal.

In addition, in this specification, a “unit” may be a hardware component such as a processor or a circuit, and/or a software component executed by a hardware component such as a processor.

In addition, in this specification, “including at least one of a, b, or c” may mean including only a, only b, only c, both a and b, both b and c, both a and c, or all of a, b, and c.
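
The seven cases listed above are exactly the non-empty subsets of {a, b, c}. The short sketch below enumerates them; the function name `at_least_one_of` is a hypothetical helper introduced only for illustration.

```python
from itertools import combinations


def at_least_one_of(items):
    """Return every non-empty subset of items, smallest subsets first."""
    return [
        set(combo)
        for size in range(1, len(items) + 1)
        for combo in combinations(items, size)
    ]


subsets = at_least_one_of(["a", "b", "c"])
print(len(subsets))  # 7
```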

The foregoing description of the present disclosure is provided for illustration, and those of ordinary skill in the art to which the present disclosure pertains will understand that it can be readily modified into other specific forms without changing the technical spirit or essential features of the present disclosure. Therefore, the embodiments described above should be understood as illustrative in all respects and not limiting. For example, each component described as a single unit may be implemented in a distributed manner, and likewise, components described as distributed may be implemented in a combined form.

The scope of the present disclosure is defined by the claims below rather than by the detailed description above, and all changes or modifications derived from the meaning and scope of the claims and their equivalents should be construed as falling within the scope of the present disclosure.

Claims

A method, performed by a server, of registering a new device for a voice assistant service, the method comprising:
obtaining specification information indicating functions of at least one pre-registered device;
comparing the functions of the pre-registered device with functions of the new device based on the specification information;
identifying, based on a result of the comparing, functions among the functions of the new device that correspond to the functions of the pre-registered device;
obtaining pre-registered utterance data related to at least some of the identified functions;
generating action data for the new device based on the identified functions and the obtained pre-registered utterance data; and
storing the obtained utterance data and the generated action data in association with the new device,
wherein the action data includes data on a series of detailed functions of the new device corresponding to the obtained utterance data.
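
As a rough illustration of the steps recited above, the following sketch walks through them with plain Python containers. It is not the claimed implementation: the function `register_new_device`, the dictionary layouts, and the matching of functions by exact name are all assumptions made for this example (the claims also allow matching on similar purpose), and each utterance here maps to a single function rather than a longer series.

```python
def register_new_device(new_device, new_functions, registered_specs, utterance_db, store):
    """Hypothetical sketch: reuse registered utterances for a new device.

    registered_specs: {device name: set of function names} (specification information)
    utterance_db:     {function name: list of pre-registered utterances}
    store:            dict that receives the result, keyed by the new device
    """
    # Compare: gather the functions of all pre-registered devices.
    registered_functions = set().union(*registered_specs.values())
    # Identify: functions of the new device that also exist on a registered device.
    matched = sorted(set(new_functions) & registered_functions)
    # Obtain: pre-registered utterances for the matched functions, where available.
    utterances = {f: list(utterance_db[f]) for f in matched if f in utterance_db}
    # Generate: action data mapping each utterance to the functions it triggers.
    action_data = {u: [f] for f, us in utterances.items() for u in us}
    # Store: keep utterances and action data in association with the new device.
    store[new_device] = {"utterances": utterances, "actions": action_data}
    return matched
```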

The method of claim 1,
wherein the identifying of the corresponding functions comprises identifying, among the functions of the new device, functions having a purpose identical or similar to that of the functions of the pre-registered device.

The method of claim 1,
wherein the comparing comprises identifying a combination of functions of the at least one pre-registered device that corresponds to a combination of functions of the new device, by comparing the combination of functions of the at least one pre-registered device with the combination of functions of the new device.

The method of claim 3,
wherein the at least one pre-registered device includes a plurality of pre-registered devices, and
wherein the comparing comprises comparing a combination of a function of a first device among the plurality of pre-registered devices and a function of a second device among the plurality of pre-registered devices with the combination of functions of the new device.
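
One way to picture the comparison across two pre-registered devices is to pool the functions of each pair of devices and check whether the pool covers the new device's functions. The sketch below does this under stated assumptions: `covering_pairs` is a hypothetical name, functions are compared by exact name, and "corresponds" is simplified to set containment.

```python
from itertools import combinations


def covering_pairs(new_functions, registered_specs):
    """Return pairs of registered devices whose pooled functions cover the new device's."""
    wanted = set(new_functions)
    pairs = []
    # Visit device names in sorted order so the result is deterministic.
    for first, second in combinations(sorted(registered_specs), 2):
        pooled = set(registered_specs[first]) | set(registered_specs[second])
        if wanted <= pooled:
            pairs.append((first, second))
    return pairs
```

For instance, a new device offering cooling and dehumidifying functions would be covered by the pair of an air conditioner and a dehumidifier, even though neither covers it alone.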

The method of claim 1,
wherein the comparing comprises deleting some functions from a function set of the at least one pre-registered device, and comparing the functions remaining after the deletion with the functions of the new device.

The method of claim 1,
wherein the obtaining of the pre-registered utterance data comprises:
extracting utterance data corresponding to the identified functions from a database (DB); and
editing the extracted utterance data.

The method of claim 1, further comprising:
identifying, among the functions of the new device, functions different from the functions of the pre-registered device; and
providing a list of the different functions to a client device.
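
The list of different functions can be thought of as a set difference between the new device's functions and everything already registered. A minimal sketch, assuming exact-name comparison and a hypothetical function `unmatched_functions`:

```python
def unmatched_functions(new_functions, registered_specs):
    """Return the new device's functions that no pre-registered device offers."""
    registered = set().union(*registered_specs.values())
    # Sort so the list sent to the client device has a stable order.
    return sorted(set(new_functions) - registered)
```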

The method of claim 7,
wherein the list of the different functions provided to the client device is used by a software development kit (SDK) module installed in the client device to generate utterance data corresponding to the different functions.

The method of claim 1,
wherein the obtained utterance data and the generated action data are used to generate or update a voice assistant model specialized for the new device.

The method of claim 1,
wherein the obtained utterance data and the generated action data are used to generate or update an ontology graph including knowledge triples that represent actions of the new device and relationships between the actions.
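
A knowledge triple is commonly written as (subject, relation, object). The sketch below derives such triples from action data shaped as a mapping from an utterance to an ordered list of actions. The function `build_triples` and the relation names `has_action` and `followed_by` are assumptions chosen for illustration; the source does not specify a triple vocabulary.

```python
def build_triples(device, action_data):
    """Derive (subject, relation, object) triples from utterance-to-actions data."""
    triples = set()
    for actions in action_data.values():
        # Link the device to each action it can perform.
        for action in actions:
            triples.add((device, "has_action", action))
        # Link consecutive actions to record their order within one utterance.
        for first, second in zip(actions, actions[1:]):
            triples.add((first, "followed_by", second))
    return triples
```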

A server for registering a new device for a voice assistant service, the server comprising:
a communication interface;
a memory storing a program including one or more instructions; and
a processor configured to execute the one or more instructions of the program stored in the memory,
wherein the processor is configured to:
obtain specification information indicating functions of at least one pre-registered device;
compare the functions of the pre-registered device with functions of the new device based on the specification information;
identify, based on a result of the comparison, functions among the functions of the new device that correspond to the functions of the pre-registered device;
obtain pre-registered utterance data related to at least some of the identified functions;
generate action data for the new device based on the identified functions and the obtained pre-registered utterance data; and
store the obtained utterance data and the generated action data in a predetermined database (DB) in association with the new device,
wherein the action data includes data on a series of detailed functions of the new device corresponding to the obtained utterance data.

The server of claim 11,
wherein the processor is configured to identify, among the functions of the new device, functions having a purpose identical or similar to that of the functions of the pre-registered device.

The server of claim 11,
wherein the processor is configured to identify a combination of functions of the at least one pre-registered device that corresponds to a combination of functions of the new device, by comparing the combination of functions of the at least one pre-registered device with the combination of functions of the new device.

The server of claim 13,
wherein the at least one pre-registered device includes a plurality of pre-registered devices, and
wherein the processor is configured to compare a combination of a function of a first device among the plurality of pre-registered devices and a function of a second device among the plurality of pre-registered devices with the combination of functions of the new device.

The server of claim 11,
wherein the processor is configured to delete some functions from a function set of the at least one pre-registered device, and to compare the functions remaining after the deletion with the functions of the new device.

The server of claim 11,
wherein the processor is configured to, in obtaining the pre-registered utterance data, extract utterance data corresponding to the identified functions from the DB and edit the extracted utterance data.

The server of claim 11,
wherein the processor is configured to identify, among the functions of the new device, functions different from the functions of the pre-registered device, and to provide a list of the different functions to a client device.

The server of claim 17,
wherein the list of the different functions provided to the client device is used by a software development kit (SDK) module installed in the client device to generate utterance data corresponding to the different functions.

The server of claim 11,
wherein the obtained utterance data and the generated action data are used to generate or update a voice assistant model specialized for the new device.

A computer-readable recording medium storing a program for executing the method of claim 1 on a computer.