KR20200058612A

KR20200058612A - Artificial intelligence speaker and talk progress method using the artificial intelligence speaker

Info

Publication number: KR20200058612A
Application number: KR1020180138817A
Authority: KR
Inventors: 이현정
Original assignee: 엔에이치엔 주식회사
Priority date: 2018-11-13
Filing date: 2018-11-13
Publication date: 2020-05-28

Abstract

Disclosed are an artificial intelligence speaker and a conversation progress method using the same, capable of performing a function of reading information necessary for a user. The artificial intelligence speaker includes a microphone unit, a speaker unit, and an artificial intelligence unit. The microphone unit receives a voice of an arbitrary user to output a voice signal. The speaker unit outputs a sound. The artificial intelligence unit performs a conversation process of analyzing the voice signal output from the microphone unit to extract a voice input sentence and generating an answer sentence according to the voice input sentence to output the generated answer sentence through the speaker unit, and when a read request command for an arbitrary text (hereinafter referred to as ′target text′) is included in the voice input sentence while performing the conversation process, performs a read process of outputting the target text through the speaker unit. As described above, since the artificial intelligence speaker reads the target text, the user is able to easily recognize content of the target text.

Description

ARTIFICIAL INTELLIGENCE SPEAKER AND TALK PROGRESS METHOD USING THE ARTIFICIAL INTELLIGENCE SPEAKER

The present invention relates to an artificial intelligence speaker and a conversation progress method using the same, and more particularly to an artificial intelligence speaker capable of holding a conversation with a user and a conversation progress method using the same.

A speech recognition device recognizes everyday human speech and performs tasks according to the recognized speech. With advances in computing and information communication, speech recognition technology lets people obtain information easily from a distance without physically moving, and it has led to the development of devices built around voice-operated systems.

Various speech recognition application systems have been developed on the basis of this technology. One of them is a system that provides desired information in response to what the user says. For example, in a telephone directory system for an organization, when the user speaks the name of the department being sought, the system displays that department's telephone number on the user's monitor.

Recently, artificial intelligence speakers capable of simple conversation have also been developed. An artificial intelligence speaker is a speaker device that recognizes a user's voice, identifies the content of the user's command, and responds according to that command. For example, Korean Patent Laid-Open Publication No. 10-2018-0046550 (published May 9, 2018) discloses a 'conversation apparatus and method using artificial intelligence capable of two-way communication with humans'.

However, artificial intelligence speakers to date have only enough intelligence to respond to simple user commands. For example, when the user asks to hear music A, the speaker loads music A and outputs it as sound; when the user asks to hear the radio program currently on air, the speaker receives that radio program in real time and outputs it. It is therefore necessary to give such artificial intelligence speakers new functions so that they can be applied in various fields.

Accordingly, the present invention has been devised to solve this problem, and an object of the present invention is to provide an artificial intelligence speaker capable of reading out information that a user needs.

Another object of the present invention is to provide a conversation progress method using the artificial intelligence speaker.

An artificial intelligence speaker according to an embodiment of the present invention includes a microphone unit, a speaker unit, and an artificial intelligence unit. The microphone unit receives the voice of an arbitrary user and outputs a voice signal. The speaker unit outputs sound. The artificial intelligence unit performs a conversation process in which it analyzes the voice signal output from the microphone unit to extract a voice input sentence, generates an answer sentence according to the voice input sentence, and outputs the answer sentence through the speaker unit. When, during the conversation process, the voice input sentence includes a read request command for an arbitrary text (hereinafter referred to as the 'target text'), the artificial intelligence unit performs a read process of outputting the target text through the speaker unit.

The artificial intelligence unit may include a conversation progress unit, a database unit, and a reading progress unit. The conversation progress unit performs the conversation process. The database unit stores at least one text, including the target text. When the voice input sentence includes the read request command during the conversation process, the reading progress unit determines the target text, receives the target text from the database unit, and then outputs the target text through the speaker unit.

The reading progress unit may determine a reading mode using the voice input sentence and then output the target text through the speaker unit according to the reading mode.

The reading mode may be either a general mode, in which the target text is read out, or a quiz mode, in which quizzes are presented along with the reading of the target text.

The reading progress unit may output the target text through the speaker unit and, when the voice input sentence includes an arbitrary question while the target text is being output, generate an answer to the question and output it through the speaker unit.

The reading progress unit may search the information stored in the database unit for an answer to the question, extract it, and output the answer through the speaker unit.

When the information stored in the database unit does not contain an answer to the question, the reading progress unit may connect to an external server to search for an answer, receive the answer from the external server, and output it through the speaker unit.

The target text may include at least one partial text and at least one quiz corresponding to the partial text.

The reading progress unit may output the partial texts one by one through the speaker unit, output the quiz through the speaker unit each time one partial text has been output, and, after the quiz is output, determine whether the voice input sentence includes the correct answer to the quiz and output the correct/incorrect determination result through the speaker unit.

When all of the partial texts have been output, the reading progress unit may output a cumulative quiz result through the speaker unit.

A conversation progress method according to an embodiment of the present invention is performed by an artificial intelligence speaker and includes: performing a conversation process of analyzing a voice signal generated from the voice of an arbitrary user to extract a voice input sentence, generating an answer sentence according to the voice input sentence, and outputting the answer sentence as sound; determining, during the conversation process, whether the voice input sentence includes a read request command for an arbitrary text (hereinafter referred to as the 'target text'); and, when the voice input sentence includes the read request command, performing a read process of outputting the target text as sound.

Performing the read process may include: determining the target text using the voice input sentence; searching for and loading the target text; and outputting the target text as sound.

Performing the read process may further include determining a reading mode using the voice input sentence. In this case, in the step of outputting the target text as sound, the target text may be output as sound according to the reading mode.

Outputting the target text as sound may include: sequentially outputting the target text as sound; determining, while the target text is being output, whether the voice input sentence includes an arbitrary question; and, when it is determined that the voice input sentence includes the question, generating an answer to the question and outputting the answer as sound.

In the step of generating an answer to the question and outputting it as sound, the answer may be searched for in and extracted from information stored in an internal memory and then output as sound.

In the step of generating an answer to the question and outputting it, the artificial intelligence speaker may connect to an external server to search for an answer to the question, receive the answer from the external server, and output it as sound.

Outputting the target text as sound may include: sequentially outputting the partial texts one by one as sound; outputting the quiz through the speaker unit each time one partial text has been output; and, after the quiz is output, determining whether the voice input sentence includes the correct answer to the quiz and outputting the correct/incorrect determination result as sound.

Outputting the target text as sound may further include outputting a cumulative quiz result as sound when all of the partial texts have been output.

According to the artificial intelligence speaker and the conversation progress method of the present invention, when the user asks for the target text to be read during a conversation with the artificial intelligence speaker, the speaker reads the target text aloud, so the user can easily grasp its content.

In addition, when the user asks a question while listening to the target text, the artificial intelligence speaker searches for and outputs an answer, so the user can easily resolve questions about the target text.

In addition, because the artificial intelligence speaker presents quizzes while reading the target text and the user answers them, the user can grasp the target text in a more enjoyable and efficient way.

FIG. 1 is a conceptual diagram illustrating an artificial intelligence speaker according to an embodiment of the present invention.
FIG. 2 is a block diagram illustrating the artificial intelligence speaker of FIG. 1 in detail.
FIG. 3 is a flowchart illustrating a conversation progress method performed by the artificial intelligence speaker of FIG. 1.
FIG. 4 is a flowchart illustrating in detail the read process of the conversation progress method of FIG. 3.
FIG. 5 is a flowchart illustrating execution in the general mode of the read process of FIG. 4.
FIG. 6 is a flowchart illustrating execution in the quiz mode within the method of FIG. 4.
FIG. 7 is a flow diagram showing an example of the conversation progress method of FIG. 3.
FIG. 8 is a flow diagram showing another example of the conversation progress method of FIG. 3.

The present invention may be modified in various ways and may take various forms; specific embodiments are illustrated in the drawings and described in detail in the text. This is not intended to limit the present invention to a particular disclosed form, and the invention should be understood to include all modifications, equivalents, and substitutes falling within its spirit and technical scope. Like reference numerals are used for like components throughout the drawings. In the accompanying drawings, the dimensions of structures may be exaggerated relative to their actual size for clarity.

Terms such as first and second may be used to describe various components, but the components should not be limited by these terms. The terms are used only to distinguish one component from another. For example, without departing from the scope of the present invention, a first component may be called a second component, and similarly a second component may be called a first component.

The terms used in this application are used only to describe specific embodiments and are not intended to limit the present invention. Singular expressions include plural expressions unless the context clearly indicates otherwise. In this application, terms such as "include" or "have" specify the presence of the features, numbers, steps, operations, components, parts, or combinations thereof described in the specification, and should be understood not to exclude in advance the presence or possible addition of one or more other features, numbers, steps, operations, components, parts, or combinations thereof. Also, when A and B are described as being 'connected' or 'coupled', this includes not only the case where A and B are directly connected or coupled but also the case where another component C is interposed between A and B.

Unless otherwise defined, all terms used herein, including technical and scientific terms, have the same meaning as commonly understood by a person of ordinary skill in the art to which the present invention pertains. Terms such as those defined in commonly used dictionaries should be interpreted as having meanings consistent with their meanings in the context of the related art, and are not to be interpreted in an idealized or excessively formal sense unless expressly so defined in this application. In addition, in the claims directed to the method invention, the order of the steps may be changed unless the steps are clearly bound to a particular order.

Hereinafter, preferred embodiments of the present invention will be described in more detail with reference to the accompanying drawings.

FIG. 1 is a conceptual diagram illustrating an artificial intelligence speaker according to an embodiment of the present invention, and FIG. 2 is a block diagram illustrating the artificial intelligence speaker of FIG. 1 in detail.

Referring to FIGS. 1 and 2, the artificial intelligence speaker 100 according to this embodiment is an electronic device capable of recognizing a user's voice and holding a conversation, and it can access an external server 200, for example a search service system, over the Internet to perform information searches. The artificial intelligence speaker 100 may include a housing unit, a microphone unit 110, a speaker unit 120, and an artificial intelligence unit 130. The microphone unit 110, the speaker unit 120, and the artificial intelligence unit 130 are mounted on or built into the housing unit.

The microphone unit 110 is mounted on the housing unit, receives the voice of an arbitrary user (hereinafter referred to as the 'speaker'), generates a voice signal in real time, and outputs it to the artificial intelligence unit 130.

The speaker unit 120 is mounted on the housing unit and may output sound under the control of a control signal output from the artificial intelligence unit 130.

The artificial intelligence unit 130 may perform a conversation process in which it analyzes the voice signal output in real time from the microphone unit 110 to extract a voice input sentence, generates an answer sentence according to the voice input sentence, and outputs the answer sentence as sound through the speaker unit 120. When the voice input sentence includes an arbitrary command, the artificial intelligence unit 130 may perform an action according to the command. For example, when the command is 'listen to song A', it may search for and load song A and then output song A as sound through the speaker unit 120.

When, during the conversation process, the voice input sentence extracted in real time includes a read request command for an arbitrary text (hereinafter referred to as the 'target text'), the artificial intelligence unit 130 may perform a read process of searching for and loading the target text and then outputting the target text through the speaker unit 120. The artificial intelligence unit 130 may suspend the conversation process while the read process is running, or it may run the read process and the conversation process together.
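As a rough illustration of the branching described above, the following sketch routes each extracted voice input sentence either to the read process or to the ordinary conversation process. The source does not say how a read request command is recognized; the regular expression, the function names `find_read_request` and `route`, and the English trigger phrasing are all assumptions made for this example.

```python
import re
from typing import Optional

# Assumed trigger phrasing ("read ..."); the source does not define the command syntax.
_READ_REQUEST = re.compile(
    r"^\s*(?:please\s+)?read\s+(?:me\s+)?(?P<title>.+?)[.!?]?\s*$",
    re.IGNORECASE,
)


def find_read_request(sentence: str) -> Optional[str]:
    """Return the requested title if the sentence is a read request, else None."""
    match = _READ_REQUEST.match(sentence)
    return match.group("title") if match else None


def route(sentence: str) -> str:
    """Pick the process that should handle one voice input sentence."""
    return "read" if find_read_request(sentence) is not None else "conversation"
```

A keyword pattern is only the simplest possible stand-in; an actual device would more plausibly rely on an intent classifier, which the source leaves unspecified.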

Meanwhile, the artificial intelligence speaker 100 may further include a user input unit (not shown), for example an input button, a touch pad, or a keypad, that generates input information from a user's input action and can thereby control the artificial intelligence unit 130.

Hereinafter, the artificial intelligence unit 130 will be described in detail.

The artificial intelligence unit 130 may include a conversation progress unit 132 that performs the conversation process, a reading progress unit 134 that performs the read process, and a database unit 136 that stores various kinds of information.

The conversation progress unit 132 may analyze, in real time, the voice signal provided in real time from the microphone unit 110 and extract the voice input sentence. For example, the conversation progress unit 132 may analyze the voice signal in real time and extract a voice input sentence each time one sentence is completed.
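The per-sentence extraction can be pictured with the minimal sketch below. It assumes, purely for illustration, that an upstream speech recognizer already delivers text fragments carrying punctuation; the source does not describe the recognizer or how sentence completion is detected, and `extract_sentences` is a hypothetical name.

```python
from typing import Iterable, Iterator, List

# Assumed sentence terminators; real systems may use pauses or a language model instead.
SENTENCE_END = (".", "?", "!")


def extract_sentences(fragments: Iterable[str]) -> Iterator[str]:
    """Yield one voice input sentence each time a sentence is completed."""
    buffer: List[str] = []
    for fragment in fragments:
        buffer.append(fragment)
        if fragment.endswith(SENTENCE_END):
            yield " ".join(buffer)
            buffer = []
    # Fragments of an unfinished sentence stay unreported until it is completed.
```

For example, feeding the fragments `"read"`, `"me"`, `"a"`, `"story."`, `"what"`, `"is"` yields the single sentence `"read me a story."`, and the trailing incomplete words are held back.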

The conversation progress unit 132 may also analyze the voice signal in real time, extract a voice waveform that characterizes the speaker's voice, and then identify who the speaker is and determine the speaker's emotional state.

The conversation progress unit 132 may generate an answer sentence according to the voice input sentence, taking into account the identity and state of the speaker, and then control the speaker unit 120 to output the answer sentence as sound. When generating the answer sentence, the conversation progress unit 132 may use the information stored in the database unit 136 or, when it determines that the database unit 136 does not contain the relevant information, access the external server 200 over the Internet, search for the relevant information, and then generate the answer sentence according to the voice input sentence.

When, while the conversation process is being performed by the conversation progress unit 132, the voice input sentence extracted in real time includes a read request command for the target text, the reading progress unit 134 may determine the target text using the accumulated information of the voice input sentences extracted so far, access the database unit 136 to search for information on the target text, load the target text from the database unit 136, and output the loaded target text as sound through the speaker unit 120. When information on the target text is not found in the database unit 136, the reading progress unit 134 may access the external server 200, search for information on the target text, and receive and load the target text from there.
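The lookup order in this paragraph (accumulated sentences, then the database unit, then the external server) can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: the database is modeled as a plain dictionary, the external server as a caller-supplied function `fetch_remote`, and title matching as a case-insensitive substring test, none of which is specified in the source.

```python
from typing import Callable, Dict, Iterable, Optional, Sequence


def determine_target(history: Sequence[str], known_titles: Iterable[str]) -> Optional[str]:
    """Scan the accumulated sentences, newest first, for a known title."""
    titles = list(known_titles)
    for sentence in reversed(history):
        lowered = sentence.lower()
        for title in titles:
            if title.lower() in lowered:
                return title
    return None


def load_target_text(
    title: str,
    local_db: Dict[str, str],
    fetch_remote: Callable[[str], Optional[str]],
) -> Optional[str]:
    """Load from the local database first; fall back to the external server."""
    text = local_db.get(title)
    if text is not None:
        return text
    return fetch_remote(title)
```

Scanning the history newest-first is a design choice of this sketch so that the most recently mentioned title wins when several appear in the conversation.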

The reading progress unit 134 may determine a reading mode using the accumulated information of the voice input sentences extracted so far, and then output the target text as sound through the speaker unit 120 according to the reading mode. The reading mode may be either a general mode, in which the target text is read out, or a quiz mode, in which quizzes are presented along with the reading of the target text.
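A minimal sketch of the mode decision is given below. The source states only that the mode is derived from the accumulated voice input sentences; the rule used here (any mention of the word "quiz" selects quiz mode) and the names `ReadingMode` and `determine_reading_mode` are assumptions for illustration.

```python
from enum import Enum
from typing import Sequence


class ReadingMode(Enum):
    GENERAL = "general"  # read the target text only
    QUIZ = "quiz"        # read the target text and present quizzes


def determine_reading_mode(history: Sequence[str]) -> ReadingMode:
    """Choose the reading mode from the sentences heard so far."""
    # Assumed rule: mentioning "quiz" anywhere in the history selects quiz mode.
    if any("quiz" in sentence.lower() for sentence in history):
        return ReadingMode.QUIZ
    return ReadingMode.GENERAL
```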

When the reading mode is the general mode, the reading progress unit 134 may output the target text as sound through the speaker unit 120 according to the general mode.

Specifically, the reading progress unit 134 may output the target text as sound through the speaker unit 120 and, when the voice input sentence extracted in real time includes an arbitrary question while the target text is being output, generate an answer to the question and output it as sound through the speaker unit 120. The reading progress unit 134 may search the information stored in the database unit 136 for an answer to the question, extract it, and output the answer as sound through the speaker unit 120. When the information stored in the database unit 136 does not contain an answer to the question, the reading progress unit 134 may access the external server 200 to search for an answer, receive the answer from the external server 200, and output the received answer as sound through the speaker unit 120.
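The general-mode flow can be sketched as a loop that reads the target text piece by piece and answers any question heard along the way, consulting local information before the external server. The simulation below is an assumption-laden illustration: questions are supplied as a mapping from sentence index to question, answers are exact-match dictionary lookups, and the external server is a caller-supplied function `ask_remote`; the source specifies none of these details.

```python
from typing import Callable, Dict, List, Optional, Sequence


def answer_question(
    question: str,
    local_answers: Dict[str, str],
    ask_remote: Callable[[str], Optional[str]],
) -> Optional[str]:
    """Look for an answer locally first, then on the external server."""
    answer = local_answers.get(question)
    if answer is None:
        answer = ask_remote(question)
    return answer


def read_general_mode(
    sentences: Sequence[str],
    questions_heard: Dict[int, str],
    local_answers: Dict[str, str],
    ask_remote: Callable[[str], Optional[str]],
) -> List[str]:
    """Return everything spoken: the text in order, with answers interleaved."""
    spoken: List[str] = []
    for index, sentence in enumerate(sentences):
        spoken.append(sentence)
        question = questions_heard.get(index)  # question heard during this sentence
        if question is not None:
            answer = answer_question(question, local_answers, ask_remote)
            if answer is not None:
                spoken.append(answer)
    return spoken
```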

When the reading mode is the quiz mode, the reading progress unit 134 may output the target text as sound through the speaker unit 120 according to the quiz mode. In this case, the target text may include a plurality of partial texts and at least one quiz corresponding to each of the partial texts.

Specifically, the reading progress unit 134 may output the partial texts one by one as sound through the speaker unit 120. Each time one partial text has been output, the reading progress unit 134 may output the quiz corresponding to that partial text as sound through the speaker unit 120. The reading progress unit 134 may then determine whether the voice input sentence extracted in real time includes the correct answer to the quiz and output the correct/incorrect determination result as sound through the speaker unit 120. When all of the partial texts have been output, the reading progress unit 134 may use the correct/incorrect determination results for the quizzes of the partial texts to output a cumulative quiz result as sound through the speaker unit 120.
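The quiz-mode sequence (read a partial text, ask its quiz, judge the reply, report a cumulative result at the end) can be sketched as below. The data layout `QuizSegment`, the substring-based judging rule, and the wording of the spoken feedback are assumptions of this example; the source does not define how a reply is matched against the correct answer.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class QuizSegment:
    text: str      # one partial text of the target text
    question: str  # quiz associated with this partial text
    answer: str    # expected correct answer


def run_quiz_mode(
    segments: Sequence[QuizSegment], replies: Sequence[str]
) -> Tuple[List[str], int]:
    """Return everything spoken and the number of correct answers."""
    spoken: List[str] = []
    correct = 0
    # zip stops at the shorter input, so segments without a reply are not read.
    for segment, reply in zip(segments, replies):
        spoken.append(segment.text)
        spoken.append(segment.question)
        # Assumed judging rule: the reply is correct if it contains the answer.
        if segment.answer.lower() in reply.lower():
            correct += 1
            spoken.append("Correct.")
        else:
            spoken.append("Incorrect.")
    spoken.append(f"You answered {correct} of {len(segments)} quizzes correctly.")
    return spoken, correct
```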

The database unit 136 may store various information needed to perform the conversation process and the reading process. For example, the database unit 136 may store a plurality of texts including the target text. Here, the database unit 136 may be a memory device.

Meanwhile, the various information stored in the database unit 136 may be updated automatically as conversations are conducted with a plurality of users including the speaker, or may be updated manually through uploading or input by an administrator.

In addition, the conversation-related information stored in the database unit 136 may include information on various kinds of conversation scenarios, conversation patterns, and conversation postures. Here, the conversation posture may be divided into a high-handed posture, such as "So what? What are you going to do about it?", and a humble posture, such as "I'm sorry. I apologize." The conversation progress unit 132 may change the conversation posture to any one of a neutral posture, the high-handed posture, and the humble posture according to information input through the microphone unit 110 or the user input unit.

In addition, the database unit 136 may further store voice data information classified by age, gender, and region (standard language or regional dialect). Accordingly, the conversation progress unit 132 may use the voice data information to change the age, gender, region, and the like of the voice output from the speaker unit 120, either selectively according to information input through the microphone unit 110 or the user input unit, or automatically.

In this embodiment, the conversation progress unit 132 and the reading progress unit 134 may be physically separate processors, or may be programs running on a single processor.

Hereinafter, the conversation progress method performed by the artificial intelligence speaker will be described in detail.

FIG. 3 is a flowchart illustrating the conversation progress method performed by the artificial intelligence speaker of FIG. 1.

Referring to FIG. 3, the conversation progress unit 132 may first perform the conversation process (S100).

Specifically, the conversation progress unit 132 may analyze, in real time, the voice signal provided in real time from the microphone unit 110 to extract the voice input sentence. At this time, the conversation progress unit 132 may also analyze the voice signal in real time to extract a speaker voice waveform that identifies the speaker's voice, and then determine who the speaker is and what emotional state the speaker is in. The conversation progress unit 132 may then generate an answer sentence according to the voice input sentence, taking the speaker's identity and state into account, and control the speaker unit 120 to output the answer sentence as sound.

Next, while the conversation process is being performed by the conversation progress unit 132, the reading progress unit 134 may determine whether the voice input sentence extracted in real time contains a read request command for the target text (S150). If the voice input sentence does not contain a read request command for the target text, the conversation progress unit 132 may continue to perform the conversation process.

If, on the other hand, the voice input sentence contains a read request command for the target text, the reading progress unit 134 may perform the reading process (S200).

Thereafter, while the reading process is being performed, the reading progress unit 134 may determine whether the reading process has ended (S250). If the reading process has ended, the conversation progress unit 132 may perform the conversation process. If the reading process has not ended, the reading progress unit 134 may continue to perform the reading process.
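The two decision points of FIG. 3 (S150 and S250) amount to a small two-state machine. The sketch below is one hedged way to express it. The `Process` enum, the function names, and in particular the keyword test in `contains_read_request` are assumptions for illustration, since the description does not specify how the read request command is detected.

```python
from enum import Enum


class Process(Enum):
    CONVERSATION = "conversation"  # S100
    READING = "reading"            # S200


def contains_read_request(sentence: str) -> bool:
    # Hypothetical detector: a plain keyword match on the word "read".
    return "read" in sentence.lower().split()


def next_process(current: Process, sentence: str, reading_finished: bool) -> Process:
    """Pick the process to run next from the latest voice input sentence."""
    if current is Process.CONVERSATION:
        # S150: switch to the reading process only on a read request command.
        if contains_read_request(sentence):
            return Process.READING
        return Process.CONVERSATION
    # S250: return to the conversation process once reading has ended.
    return Process.CONVERSATION if reading_finished else Process.READING
```

Calling `next_process` once per extracted voice input sentence reproduces the loop of FIG. 3: the conversation process keeps running until a read request arrives, and the reading process keeps running until it reports that it has ended.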

Hereinafter, the reading process performed by the reading progress unit 134 will be described in detail.

FIG. 4 is a flowchart illustrating in detail the reading process of the conversation progress method of FIG. 3.

Referring to FIG. 4, the reading progress unit 134 may first determine the target text using the accumulated information of the voice input sentences extracted so far (S210).

Next, the reading progress unit 134 may access the database unit 136 to search for information on the target text and then load the target text from the database unit 136 (S220). If no information on the target text is found in the database unit 136, the reading progress unit 134 may instead access the external server 200, search for information on the target text, and receive and load the target text from the external server 200.

Next, the reading progress unit 134 may determine the reading mode using the accumulated information of the voice input sentences extracted so far (S230). Here, the reading mode may be either the general mode or the quiz mode. Step S230 is independent of steps S210 and S220 and may be performed before, after, or at the same time as those steps.

Next, the reading progress unit 134 may output the loaded target text as sound through the speaker unit 120 according to the reading mode (S240).
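Steps S210 to S230 can be read as a short preparation pipeline that runs before the output step S240. The following sketch is illustrative only. The title-matching and keyword heuristics, the `known_titles` list, and all function names are assumptions, because the description says only that the accumulated voice input sentences are used.

```python
from typing import Callable, Dict, List, Optional, Tuple


def determine_target(history: List[str], known_titles: List[str]) -> Optional[str]:
    # S210 (hypothetical heuristic): the most recently mentioned known title.
    for sentence in reversed(history):
        for title in known_titles:
            if title.lower() in sentence.lower():
                return title
    return None


def determine_mode(history: List[str]) -> str:
    # S230 (hypothetical heuristic): quiz mode if the user ever said "quiz".
    return "quiz" if any("quiz" in s.lower() for s in history) else "general"


def prepare_reading(
    history: List[str],
    known_titles: List[str],
    database: Dict[str, str],
    fetch_remote: Callable[[str], Optional[str]],
) -> Optional[Tuple[str, str]]:
    """Return (reading mode, target text), or None if nothing can be read."""
    title = determine_target(history, known_titles)
    if title is None:
        return None
    # S220: load from the database, falling back to the external server.
    text = database.get(title)
    if text is None:
        text = fetch_remote(title)
    if text is None:
        return None
    # S230 does not depend on S210/S220, so it could equally run first.
    return determine_mode(history), text
```

The returned pair is what step S240 would need: the mode that selects how to read and the loaded text to be read.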

Hereinafter, how the reading process is performed when the reading mode is the general mode will be described in detail.

FIG. 5 is a flowchart illustrating execution of the reading process of FIG. 4 in the general mode.

Referring to FIG. 5, the reading progress unit 134 first outputs the target text as sound through the speaker unit 120 (S10).

Next, while the target text is being output, the reading progress unit 134 may determine whether the voice input sentence extracted in real time contains a question (S20). If the voice input sentence does not contain a question, step S10 may continue to be performed.

If, on the other hand, the voice input sentence contains a question, the reading progress unit 134 may generate an answer to the question and output it as sound through the speaker unit 120. Here, the reading progress unit 134 may search the information stored in the database unit 136 for the answer and extract it, or may access the external server 200, search for the answer, and receive it from the external server 200.

Next, the reading progress unit 134 may determine whether the voice input sentence extracted in real time contains a read end command (S40). If the voice input sentence does not contain the read end command, step S10 may continue to be performed. If the voice input sentence contains the read end command, the reading process ends and the conversation process may be performed again.
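The general-mode loop of FIG. 5 can be simulated with pre-recorded inputs, as in the sketch below. It is a simplification under stated assumptions: the target text is pre-split into sentences, at most one voice input sentence is heard per sentence read, a trailing question mark marks a question, and the literal phrase "stop reading" is the read end command. None of these details come from the description.

```python
from typing import Callable, List, Optional

READ_END_COMMAND = "stop reading"  # hypothetical wording of the read end command


def read_general_mode(
    sentences: List[str],
    voice_inputs: List[Optional[str]],
    answer: Callable[[str], str],
) -> List[str]:
    """Return everything the speaker unit would say, in order.

    voice_inputs[i] is the voice input sentence heard while sentences[i]
    is being read, or None if the user said nothing.
    """
    spoken: List[str] = []
    for sentence, heard in zip(sentences, voice_inputs):
        spoken.append(sentence)           # S10: read the next piece of text
        if heard is None:
            continue
        if heard.endswith("?"):           # S20: does the input contain a question?
            spoken.append(answer(heard))  # answer step (not numbered in the text)
        if heard == READ_END_COMMAND:     # S40: read end command received
            break
    return spoken
```

Because the question check and the end-command check both run on every input, reading resumes automatically after an answer and stops only on the end command or when the text runs out.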

Hereinafter, how the reading process is performed when the reading mode is the quiz mode will be described in detail.

FIG. 6 is a flowchart illustrating execution of the reading process of FIG. 4 in the quiz mode.

Referring to FIG. 6, the reading progress unit 134 may perform the reading process in the same way as in the general mode of FIG. 5 (S10 to S40).

Meanwhile, if the voice input sentence does not contain a question in step S20, the reading progress unit 134 may separately perform a quiz process. In this case, the target text may include a plurality of partial texts and at least one quiz corresponding to each of the partial texts.

Specifically, once output of any one of the partial texts is complete, the reading progress unit 134 determines whether a quiz exists for that partial text (S50). If no quiz exists, the reading progress unit 134 may continue to output the partial texts sequentially, one at a time, as sound through the speaker unit 120.

If, on the other hand, a quiz exists in step S50, the reading progress unit 134 may output the quiz as sound through the speaker unit 120 (S60).

Next, the reading progress unit 134 may determine whether the voice input sentence extracted in real time contains the correct answer to the quiz and output the correct/incorrect determination result as sound through the speaker unit 120 (S70). If the voice input sentence does not contain the correct answer to the quiz, the reading progress unit 134 may also output a request to answer again as sound through the speaker unit 120 and perform steps S60 and S70 again.

Next, the reading progress unit 134 may determine whether all of the partial texts have been output or whether the voice input sentence extracted in real time contains a quiz end command (S80).

If all of the partial texts have been output, or if the voice input sentence extracted in real time contains the quiz end command, the reading progress unit 134 may use the correct/incorrect determination results for the quizzes of the partial texts to output a cumulative quiz result as sound through the speaker unit 120 (S90).

If, on the other hand, not all of the partial texts have been output and the voice input sentence extracted in real time does not contain the quiz end command, step S10 may be performed again.
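The quiz flow of FIG. 6 (S50 to S90) can be sketched as a loop over partial texts that keeps a running score. The `Quiz` and `Part` types, the substring test for a correct answer, the spoken phrases, and the `max_attempts` limit on re-answer requests are all assumptions for this example. The quiz end command checked in S80 is left out to keep the sketch short.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Tuple


@dataclass
class Quiz:
    question: str
    answer: str


@dataclass
class Part:
    text: str
    quiz: Optional[Quiz] = None


def run_quiz_mode(
    parts: List[Part], replies: Iterable[str], max_attempts: int = 2
) -> Tuple[List[str], int, int]:
    """Return (spoken utterances, number correct, number of quizzes asked)."""
    spoken: List[str] = []
    correct = asked = 0
    reply_iter = iter(replies)
    for part in parts:
        spoken.append(part.text)                   # S10: read one partial text
        if part.quiz is None:                      # S50: no quiz for this part
            continue
        asked += 1
        for _ in range(max_attempts):
            spoken.append(part.quiz.question)      # S60: present the quiz
            reply = next(reply_iter, "")
            if part.quiz.answer.lower() in reply.lower():  # S70: judge the reply
                spoken.append("Correct.")
                correct += 1
                break
            spoken.append("Incorrect.")            # S60/S70 repeat on the next pass
    # S90: cumulative result once every partial text has been output (S80).
    spoken.append(f"You answered {correct} of {asked} quizzes correctly.")
    return spoken, correct, asked
```

Keeping the per-quiz results as counters is the simplest way to produce the cumulative result of S90. A fuller version might store each determination result separately so the summary can list which quizzes were missed.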

FIG. 7 is a flow diagram showing one example of the conversation progress method of FIG. 3, and FIG. 8 is a flow diagram showing another example of the conversation progress method of FIG. 3.

Referring to FIGS. 7 and 8, the artificial intelligence speaker 100 may perform the reading process in the general mode as shown in FIG. 7, or in the quiz mode as shown in FIG. 8.

As described above, according to this embodiment, when a user asks for an arbitrary text to be read during a conversation with the artificial intelligence speaker 100, the artificial intelligence speaker 100 searches for the text, loads it, and reads it aloud, so that the user can easily grasp the content of the target text.

In addition, when the user asks a question while listening to the target text, the artificial intelligence speaker 100 searches for an answer to the question and outputs it as sound, so that the user can easily resolve questions about the target text.

In addition, when the artificial intelligence speaker 100 presents a quiz while reading the target text and the user answers the quiz, the user can grasp the content of the target text more enjoyably and efficiently through the quiz.

Although the foregoing detailed description of the present invention has been given with reference to preferred embodiments, those skilled in the art or those having ordinary knowledge in the art will understand that the present invention may be modified and changed in various ways without departing from the spirit and technical scope of the present invention as set forth in the claims below.

100: artificial intelligence speaker 110: microphone unit
120: speaker unit 130: artificial intelligence unit
132: conversation progress unit 134: reading progress unit
136: database unit 200: external server

Claims

An artificial intelligence speaker comprising:
a microphone unit that receives a user's voice and outputs a voice signal;
a speaker unit that outputs sound; and
an artificial intelligence unit that performs a conversation process of analyzing the voice signal output from the microphone unit to extract a voice input sentence, generating an answer sentence according to the voice input sentence, and outputting the answer sentence through the speaker unit, and that, when the voice input sentence includes a read request command for an arbitrary text (hereinafter referred to as a 'target text') while the conversation process is being performed, performs a reading process of outputting the target text through the speaker unit.

The artificial intelligence speaker of claim 1, wherein the artificial intelligence unit comprises:
a conversation progress unit that performs the conversation process;
a database unit that stores at least one text including the target text; and
a reading progress unit that, when the read request command is included in the voice input sentence while the conversation process is being performed, determines the target text, receives the target text from the database unit, and outputs the target text through the speaker unit.

The artificial intelligence speaker of claim 2, wherein the reading progress unit determines a reading mode using the voice input sentence and then outputs the target text through the speaker unit according to the reading mode.

The artificial intelligence speaker of claim 3, wherein the reading mode is one of a general mode of reading the target text and a quiz mode of presenting a quiz while reading the target text.

The artificial intelligence speaker of claim 3, wherein the reading progress unit outputs the target text through the speaker unit and, when an arbitrary question is included in the voice input sentence while the target text is being output, generates an answer to the question and outputs the answer through the speaker unit.

The artificial intelligence speaker of claim 5, wherein the reading progress unit searches the information stored in the database unit for the answer to the question, extracts the answer, and then outputs the answer through the speaker unit.

The artificial intelligence speaker of claim 5, wherein, when the answer to the question is not included in the information stored in the database unit, the reading progress unit accesses an external server to search for the answer to the question, receives the answer from the external server, and outputs the received answer through the speaker unit.

The artificial intelligence speaker of claim 3, wherein the target text comprises:
at least one partial text; and
at least one quiz corresponding to the partial text.

The artificial intelligence speaker of claim 8, wherein the reading progress unit outputs the partial texts one at a time through the speaker unit, outputs the quiz through the speaker unit each time one of the partial texts is output, and, after the quiz is output, determines whether the voice input sentence includes the correct answer to the quiz and outputs a correct/incorrect determination result through the speaker unit.

The artificial intelligence speaker of claim 9, wherein the reading progress unit outputs a cumulative quiz result when all of the partial texts have been output.

A conversation progress method performed by an artificial intelligence speaker, the method comprising:
performing a conversation process of analyzing a voice signal generated by receiving a voice of an arbitrary user to extract a voice input sentence, generating an answer sentence according to the voice input sentence, and outputting the answer sentence as sound;
determining, while the conversation process is being performed, whether the voice input sentence includes a read request command for an arbitrary text (hereinafter referred to as a 'target text'); and
performing, when the voice input sentence includes the read request command, a reading process of outputting the target text as sound.

The method of claim 11, wherein performing the reading process comprises:
determining the target text using the voice input sentence;
searching for and loading the target text; and
outputting the target text as sound.

The method of claim 12, wherein performing the reading process further comprises determining a reading mode using the voice input sentence, and
wherein, in outputting the target text as sound, the target text is output as sound according to the reading mode.

The method of claim 13, wherein the reading mode is one of a general mode of reading the target text and a quiz mode of presenting a quiz while reading the target text.

The method of claim 13, wherein outputting the target text as sound comprises:
sequentially outputting the target text as sound;
determining whether an arbitrary question is included in the voice input sentence while the target text is being output; and
generating, when it is determined that the question is included in the voice input sentence, an answer to the question and outputting the answer as sound.

The method of claim 15, wherein, in generating the answer to the question and outputting the answer as sound, the answer to the question is searched for in and extracted from information stored in an internal memory and is then output as sound.

The method of claim 15, wherein, in generating the answer to the question and outputting the answer as sound, an external server is accessed to search for the answer to the question, and the answer is received from the external server and output as sound.

The method of claim 14, wherein the target text comprises:
at least one partial text; and
at least one quiz corresponding to the partial text.

The method of claim 18, wherein outputting the target text as sound comprises:
sequentially outputting the partial texts one at a time as sound;
outputting the quiz through the speaker unit each time one of the partial texts is output; and
determining, after the quiz is output, whether the voice input sentence includes the correct answer to the quiz and outputting a correct/incorrect determination result as sound.

The method of claim 19, wherein outputting the target text as sound further comprises outputting a cumulative quiz result when all of the partial texts have been output.