KR20190026518A

KR20190026518A - Method for operating voice recognition apparatus

Info

Publication number: KR20190026518A
Application number: KR1020170113546A
Authority: KR
Inventors: 박치완; 강광희; 이규동
Original assignee: 엘지전자 주식회사
Priority date: 2017-09-05
Filing date: 2017-09-05
Publication date: 2019-03-13

Abstract

An objective of the present invention is to provide a voice recognition apparatus capable of providing user-personalized voice guidance, and an operating method thereof. According to one aspect of the present invention, the operating method comprises: receiving a voice input signal of a user through a microphone; transmitting voice data corresponding to the voice input signal to a voice recognition server system; identifying the user based on a frequency and an intensity of the voice input signal; receiving, from the voice recognition server system, a response signal based on the voice input signal; and outputting a voice guidance message corresponding to the received response signal. In the outputting step, the voice guidance message is output in a voice based on tone data stored in a database in correspondence with the identified user, thereby providing user-personalized voice guidance.

Description

The present invention relates to a voice recognition apparatus and an operating method thereof, and more particularly, to a voice recognition apparatus capable of providing user-customized voice guidance and an operating method thereof.

A voice recognition apparatus is a device for performing a voice recognition function.

Meanwhile, home appliances such as air conditioners, washing machines, and vacuum cleaners used in a given space such as a home or an office have each performed their own functions and operations according to user manipulation.

To operate a home appliance, a user may directly manipulate buttons or the like provided on the appliance body, or may use a remote control device such as a remote controller to avoid the inconvenience of walking to the appliance body each time an input is needed.

However, even when a remote controller is used, the user must select and press an operation key for each function, which is inconvenient, and when the room is dark, separate lighting is needed to identify the remote controller and its operation keys.

Accordingly, research on controlling home appliances using voice recognition technology is increasing.

Prior Art 1 (Laid-Open Patent Publication No. 10-1999-00069703) discloses a remote controller for an air conditioner that includes a voice input unit and a signal processing unit and that generates and transmits an operation signal according to voice recognition.

Prior Art 2 (Laid-Open Patent Publication No. 10-2006-0015092) converts an input voice signal into a digital signal and text and then checks whether a matching control command exists in a database. If a matching control command exists, each device in the air conditioner is controlled accordingly; if not, a keyword is extracted and each device in the air conditioner is controlled according to the control command associated with that keyword.
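By way of illustration only, the match-then-fallback behavior described for Prior Art 2 might be sketched as follows. The command table, the keyword table, the command names, and the function name are all hypothetical; none of them is taken from the cited publication.

```python
from typing import Optional

# Hypothetical table of full phrases mapped to control commands (illustrative only).
COMMANDS = {
    "turn on the air conditioner": "POWER_ON",
    "turn off the air conditioner": "POWER_OFF",
}

# Hypothetical keyword associations used when no full phrase matches.
KEYWORDS = {
    "hot": "TEMP_DOWN",
    "cold": "TEMP_UP",
}


def resolve_command(text: str) -> Optional[str]:
    """Return a control command for recognized text, or None if nothing applies."""
    # Normalize case and whitespace before comparing against the table.
    normalized = " ".join(text.lower().split())

    # Step 1: look for a control command that matches the whole utterance.
    if normalized in COMMANDS:
        return COMMANDS[normalized]

    # Step 2: no exact match, so fall back to the first known keyword.
    for word in normalized.split():
        if word in KEYWORDS:
            return KEYWORDS[word]

    return None
```

A table lookup of this kind handles only the phrases and keywords that have been entered in advance, which is consistent with the limitation on natural-language recognition discussed for the prior art.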

However, the system resources that individual devices such as remote controllers and air conditioners can be equipped with are limited. In particular, recognizing natural language, rather than only a few simple words, requires a large amount of computation and is difficult to implement with an embedded module mounted in an individual device.

Therefore, the voice recognition techniques of Prior Art 1 and Prior Art 2 are limited in recognizing and processing the diverse natural-language voice commands of users around the world.

Accordingly, there is a need for a way to recognize and process natural language without being constrained by the system resources of individual devices, and to control home appliances conveniently.

There is also a need for a way to help users who are unfamiliar with a device make convenient use of its various functions and services, by providing guidance messages by voice in response to a user's voice input or a specific event.

An object of the present invention is to provide a voice recognition apparatus capable of providing user-customized voice guidance, and an operating method thereof.

An object of the present invention is to provide a voice recognition apparatus capable of identifying a user and providing voice guidance similar to the tone of the identified user, and an operating method thereof.

An object of the present invention is to provide a voice recognition apparatus capable of performing natural-language voice recognition through communication with a server system, and an operating method thereof.

An object of the present invention is to provide a voice recognition apparatus, and an operating method thereof, that allow even unskilled users to conveniently use various functions and services, thereby increasing user convenience.

To achieve the above or other objects, an operating method of a voice recognition apparatus according to one aspect of the present invention includes: receiving a voice input signal of a user through a microphone; transmitting voice data corresponding to the voice input signal to a voice recognition server system; identifying the user based on a frequency and an intensity of the voice input signal; receiving, from the voice recognition server system, a response signal based on the voice input signal; and outputting a voice guidance message corresponding to the received response signal. In the outputting step, the voice guidance message is output in a voice based on tone data stored in a database in correspondence with the identified user, whereby user-customized voice guidance can be provided.

To achieve the above or other objects, an operating method of a voice recognition apparatus according to one aspect of the present invention includes: receiving a voice input signal of a user through a microphone; identifying the user based on a frequency and an intensity of the voice input signal; and outputting a voice guidance message based on tone data stored in a database in correspondence with the identified user, whereby voice guidance similar to the tone of the identified user can be provided.
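As a non-limiting illustration, the steps of identifying the user from the frequency and intensity of the voice input and then selecting the stored tone data might be sketched as follows. The profile fields, the sample values, the distance measure, and the threshold are assumptions made for this sketch and are not specified in this document.

```python
import math
from dataclasses import dataclass
from typing import Optional


@dataclass
class UserProfile:
    name: str
    pitch_hz: float      # assumed: typical voice frequency registered for the user
    intensity_db: float  # assumed: typical voice intensity registered for the user
    tone_data: str       # assumed: key of the tone data stored for the user


# Hypothetical database contents (illustrative values only).
PROFILES = [
    UserProfile("user_a", 120.0, 60.0, "tone_a"),
    UserProfile("user_b", 220.0, 55.0, "tone_b"),
]


def identify_user(pitch_hz: float, intensity_db: float,
                  max_distance: float = 30.0) -> Optional[UserProfile]:
    """Return the closest registered profile, or None if none is close enough."""
    best, best_dist = None, max_distance
    for profile in PROFILES:
        # Plain Euclidean distance over (frequency, intensity); an assumption.
        dist = math.hypot(pitch_hz - profile.pitch_hz,
                          intensity_db - profile.intensity_db)
        if dist < best_dist:
            best, best_dist = profile, dist
    return best


def guidance_voice(pitch_hz: float, intensity_db: float,
                   default: str = "default_voice") -> str:
    """Choose the tone data used to output the voice guidance message."""
    user = identify_user(pitch_hz, intensity_db)
    return user.tone_data if user is not None else default
```

In this sketch an unidentified speaker falls back to a default voice; how an actual apparatus handles that case is not stated here.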

According to at least one of the embodiments of the present invention, user-customized voice guidance can be provided, thereby increasing user convenience and emotional quality.

Further, according to at least one of the embodiments of the present invention, a user can be identified and voice guidance similar to the tone of the identified user can be provided.

Further, according to at least one of the embodiments of the present invention, natural-language voice recognition can be performed efficiently through communication with a server system.

Further, according to at least one of the embodiments of the present invention, even unskilled users can conveniently use various functions and services, thereby increasing user convenience.

Further, according to at least one of the embodiments of the present invention, voice recognition performance can be improved using data collected per product and per individual.

Various other effects will be disclosed directly or implicitly in the detailed description of the embodiments of the present invention given below.

FIG. 1 illustrates a home network system according to an embodiment of the present invention.
FIG. 2 illustrates a home network system according to another embodiment of the present invention.
FIG. 3 is a perspective view of a voice recognition apparatus according to an embodiment of the present invention.
FIG. 4 shows a front view (a) of the voice recognition apparatus and a cross-sectional view (b) taken along line A1-A1 shown in (a).
FIG. 5 is an enlarged view of a portion of FIG. 4.
FIG. 6A is a right side view of the voice recognition apparatus.
FIG. 6B shows cross-sectional views of the grill taken at the respective portions indicated in FIG. 6A.
FIG. 7 is a block diagram illustrating the control relationship among the main components of the voice recognition apparatus.
FIG. 8 is an exploded perspective view of the cover.
FIG. 9 illustrates the cover with the window removed.
FIG. 10 is an exploded view of the state before the voice input PCB is coupled to the window support.
FIG. 11 is a cross-sectional view of the state in which the voice input PCB is coupled to the window support.
FIG. 12 schematically illustrates a smart home system including a voice recognition server system and a voice recognition apparatus according to an embodiment of the present invention.
FIG. 13A is an example of a voice recognition server system according to an embodiment of the present invention.
FIG. 13B is an example of a voice recognition server system according to an embodiment of the present invention.
FIG. 14 illustrates an example of an internal block diagram of a voice recognition apparatus according to an embodiment of the present invention.
FIG. 15 is a flowchart illustrating an operating method of a voice recognition apparatus according to an embodiment of the present invention.
FIG. 16 is a flowchart illustrating an operating method of a voice recognition apparatus according to an embodiment of the present invention.
FIG. 17 is a flowchart illustrating an operating method of a voice recognition apparatus according to an embodiment of the present invention.
FIG. 18 is a flowchart illustrating an operating method of a voice recognition apparatus according to an embodiment of the present invention.
FIG. 19 is a diagram referred to in describing an operating method of a voice recognition apparatus according to an embodiment.

Hereinafter, embodiments of the present invention will be described in detail with reference to the accompanying drawings. However, the present invention is not limited to these embodiments and may of course be modified into various forms.

In the drawings, parts unrelated to the description are omitted in order to describe the present invention clearly and concisely, and the same reference numerals are used for identical or closely similar parts throughout the specification.

The suffixes "module" and "unit" used for components in the following description are assigned merely for ease of drafting this specification and do not in themselves carry any particularly important meaning or role. Accordingly, "module" and "unit" may be used interchangeably.

FIG. 1 illustrates a network system according to an embodiment of the present invention.

A network system is a collection of devices that form a network by communicating with one another within a given space such as a home or an office. As one embodiment of such a network system, FIG. 1 shows a home network system built in a home.

Hereinafter, the apparatus 1 is described using, as an example, a voice recognition apparatus (hub, 1) for a communication network having a sound output function, but it is not necessarily limited thereto. Depending on the point of view, the apparatus 1 may also be referred to as a sound output device.

Referring to FIG. 1, a network system according to an embodiment of the present invention may include accessories 2, 3a, 3b, a gateway 4, an access point 7, and a voice recognition apparatus 1 or a sound output device.

The accessories 2, 3a, 3b, the gateway 4, the access point 7, and/or the voice recognition apparatus 1 can communicate with one another according to a predetermined communication protocol, and this communication may be based on technologies such as Wi-Fi, Ethernet, Zigbee, Z-Wave, and Bluetooth.

Wi-Fi is originally a trademark of the Wi-Fi Alliance, but the term is commonly used for the wireless communication technology itself. It refers to a set of technologies that, in accordance with the provisions of the wireless LAN (WLAN) standard (IEEE 802.11), support wireless LAN connections between devices, device-to-device connections (Wi-Fi P2P), and PAN/LAN/WAN configurations. Hereinafter, a "Wi-Fi module" is defined as a device that performs wireless communication based on Wi-Fi technology.

Ethernet is a networking technology according to the 802.3 standard of the Institute of Electrical and Electronics Engineers (IEEE) and is the most representative standard for local area network (LAN) hardware, protocols, and cabling. Ethernet mainly uses carrier sense multiple access with collision detection (CSMA/CD) for data transmission. Hereinafter, an "Ethernet module" is defined as a device that performs communication based on Ethernet technology.

Zigbee is a wireless network technology for forming a personal area network and communicating over it using small, low-power digital radios, and is a communication scheme that follows the provisions of IEEE 802.15. Because it is small, consumes little power, and is inexpensive, it has drawn attention as a solution for building ubiquitous environments such as home networks, and it is used in the short-range communication market for intelligent home networks and buildings as well as in industrial device automation, logistics, human interfaces, telematics, environmental monitoring, and military applications.

The Zigbee protocol consists of a physical layer, a media access control (MAC) layer, a network layer, and an application layer. The Zigbee physical layer and MAC layer are defined in the IEEE 802.15.4 standard.

The Zigbee network layer supports routing and addressing for tree and mesh structures, and the ZigBee Home Automation Public Profile and the ZigBee Smart Energy Profile are the application profiles most commonly used. RF4CE, a newer Zigbee specification, defines a solution for remote control of home appliances and a simple network stack for star topologies; RF4CE uses the 2.4 GHz frequency band and provides security using AES-128.

Zigbee is mainly used in fields where a low data rate is sufficient but long battery life and security are required, and it is suited to periodic or intermittent data transmission and to simple signal transfer from sensors, input devices, and the like. Applications include wireless light switches, household electricity meters, traffic management systems, and other personal and industrial devices that require short-range, low-speed communication. Zigbee has the advantage of being relatively simpler and cheaper than other WPAN technologies such as Bluetooth and Wi-Fi. Hereinafter, a "Zigbee module" is defined as a device that performs wireless communication based on Zigbee technology.

Z-Wave is a wireless transmission scheme designed for devices that require low power and low bandwidth, such as home automation and sensor networks, and its main purpose is to provide reliable communication between one or more nodes and a control unit in a wireless network. Z-Wave consists of a physical layer, a media access control (MAC) layer, a transport layer, a routing layer, and an application layer, and it uses the 900 MHz band (Europe: 869 MHz, United States: 908 MHz) and the 2.4 GHz band while providing data rates of 9.6 kbps, 40 kbps, and 200 kbps. Hereinafter, a "Z-Wave module" is defined as a device that performs wireless communication based on Z-Wave technology.

The accessory 2 can be installed at any location the user desires and may include various sensors such as a temperature sensor, a humidity sensor, a vibration sensor, a proximity sensor, and an infrared (IR) sensor. Information acquired by these sensors may be transmitted to the voice recognition apparatus 1 over the network, and conversely, signals for controlling the sensors may be transmitted from the voice recognition apparatus 1 to the accessory 2.

The accessory 2 may also be configured to remotely control home appliances located nearby. For example, the accessory 2 may include a transmitting device that emits an infrared signal according to a control signal transmitted over the network.

The infrared sensor may include a transmitter that emits infrared light and a receiver that receives the infrared light emitted from the transmitter when it strikes an object and is reflected.

The access point 7 is a device that relays wireless equipment so that it can connect to a network, and it connects the home network to the Internet. The home appliance 5, the voice recognition apparatus 1, the accessory 3b, and the like may be connected to the access point 7 by wire (for example, Ethernet) or wirelessly (for example, Wi-Fi).

The gateway 4 is a device that interconnects networks using different protocols so that they can exchange information with each other. For example, by converting a Zigbee (or Z-Wave) signal received from the accessories 2, 3b into a Wi-Fi signal, the gateway 4 can relay between the accessories 2, 3b and the access point 7.
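Purely as an illustrative sketch, the conversion performed by such a gateway can be thought of as re-encoding a frame received on the Zigbee side into a message suitable for the IP network behind the access point. The frame fields, the JSON layout, and the function name below are hypothetical and are not defined by this document or by the Zigbee specification.

```python
import json
from dataclasses import dataclass


@dataclass
class ZigbeeFrame:
    source_addr: int  # assumed: 16-bit short address of the sending accessory
    cluster: str      # assumed: label for the kind of data, e.g. "temperature"
    payload: bytes    # raw sensor bytes as received


def to_ip_message(frame: ZigbeeFrame) -> bytes:
    """Re-encode a received frame as a UTF-8 JSON message for the Wi-Fi side."""
    body = {
        "source": f"{frame.source_addr:04x}",  # address as four hex digits
        "cluster": frame.cluster,
        "payload": frame.payload.hex(),        # bytes carried as a hex string
    }
    return json.dumps(body, sort_keys=True).encode("utf-8")
```

The reverse direction, from the access point toward the accessory, would decode the message and rebuild a frame in the same way.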

The home network system can connect to the Internet through the access point 7 and can thereby connect to a server 8 that provides services over the Internet. The server (or cloud) 8 may be managed by the manufacturer of the accessories 2, 3a, 3b and/or the voice recognition apparatus 1, by a seller thereof, or by a service provider under contract with the manufacturer or seller. The server 8 stores software and data, and the data may have been received from the home network. Upon request from the voice recognition apparatus 1, the server 8 may transmit the stored software or data to the home network over the Internet.

The server 8 can also exchange information with a personal computer (PC) connected to the Internet and with a mobile terminal such as a smartphone. Information transmitted from the voice recognition apparatus 1 or the accessories 2, 3a, 3b may be stored in the server 8 and may be transmitted to a mobile terminal 6 connected to the server 8. Information transmitted from the mobile terminal 6 may likewise be transmitted to the voice recognition apparatus 1 or the accessories 2, 3a, 3b via the server 8, so the voice recognition apparatus 1 or the accessories 2, 3a, 3b can also be controlled through the mobile terminal 6.

A smartphone, a widely used type of mobile terminal 6, provides a convenient graphical UI, so the accessories 2, 3a, 3b can be controlled through the UI, and information received from the accessories 2, 3a, 3b can be processed and displayed. In addition, the functions that can be implemented through the accessories 2, 3a, 3b may be extended or changed by updating an application installed on the smartphone. However, it is also possible to control the accessories 2, 3a, 3b, or to process and display information received from them, using the voice recognition apparatus 1 alone, without the mobile terminal 6.

Communication between the voice recognition apparatus 1 and the accessories 2, 3a, 3b may take place via the gateway 4 and the access point 7. Specifically, a signal (or information) output from the accessories 2, 3b is transmitted to the voice recognition apparatus 1 via the gateway 4 and then the access point 7; conversely, information output from the voice recognition apparatus 1 may be transmitted to the accessories 2, 3b via the access point 7 and then the gateway 4. Depending on the embodiment, communication between the accessories 2, 3a, 3b and the voice recognition apparatus 1 is possible even when the network is disconnected from the Internet.

In addition to the accessories 2, 3a, 3b described above, various other kinds of accessories may be provided depending on the embodiment. For example, an accessory may include an air quality sensor that senses air quality, a smart plug, a CT sensor, a Nest thermostat, a sleep sensor, and the like.

An accessory may be attached to the home appliance 5. For example, an accessory with a vibration sensor may be attached to a washing machine to sense vibration generated while the washing machine is operating, and a signal output from the vibration sensor according to the sensed vibration may be transmitted to the network.

An accessory may also be attached somewhere other than the home appliance 5. For example, to sense the opening and closing of a door in a residence, an accessory with a motion sensor (for example, an infrared sensor) may be attached to a wall to sense the opening and closing of the door. If no opening or closing of the door is sensed for a long period, something may have happened to the resident, and this circumstance may be reported to a preset mobile terminal 6.

Furthermore, an accessory with the motion sensor may be used to sense the opening and closing of a refrigerator door; if no opening or closing of the refrigerator door is sensed for a long period, something may have happened to the resident, and this circumstance may be reported to the preset mobile terminal 6.

In these embodiments, a signal transmitted from the accessory over the network may be received by the mobile terminal 6, and an application installed on the mobile terminal 6 may analyze the received signal to determine the operating state of the home appliance 5 (for example, an unbalance in the washing machine) or door opening and closing information. This information, or a result derived by processing it (for example, a warning of abnormal operation of the washing machine, or a notification asking that the resident's well-being be checked because the door has not been opened or closed for a long period), may be output through the display or speaker of the mobile terminal 6.
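For illustration, the check for a long period without door activity might be sketched as follows. The 24-hour limit, the message text, and the function name are assumptions for this sketch; the document does not state how long "a long period" is.

```python
from datetime import datetime, timedelta
from typing import Optional


def inactivity_alert(last_door_event: datetime, now: datetime,
                     limit: timedelta = timedelta(hours=24)) -> Optional[str]:
    """Return a notification text if no door event occurred within `limit`."""
    # Compare the time elapsed since the last sensed open/close event
    # against the assumed limit.
    if now - last_door_event >= limit:
        return ("No door activity detected for a long period; "
                "please check on the resident.")
    return None
```

An application on the mobile terminal could call such a check periodically with the timestamp of the most recent signal received from the accessory.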

The voice recognition apparatus 1 may include a microphone (not shown), and, according to an installed voice recognition program, may extract a command from voice input through the microphone and perform control accordingly.

FIG. 2 illustrates a home network system according to another embodiment of the present invention.

The home network system according to this other embodiment differs from the embodiment described above in that the gateway 4 is not provided and the voice recognition apparatus 1 also performs the function that the gateway 4 performed; the other features are substantially the same as in the embodiment described above.

The accessories 2, 3b can communicate directly with the voice recognition apparatus 1 without going through the gateway 4 (see FIG. 1). Preferably, the accessories 2, 3b and the voice recognition apparatus 1 communicate using Zigbee, in which case the accessories 2, 3b and the voice recognition apparatus 1 may each be provided with a Zigbee module.

FIG. 3 is a perspective view of a voice recognition apparatus according to an embodiment of the present invention. FIG. 4 shows a front view (a) of the voice recognition apparatus and a cross-sectional view (b) taken along line A1-A1 indicated in (a). FIG. 5 is an enlarged view of a portion of FIG. 4. FIG. 6A is a right side view of the voice recognition apparatus. FIG. 6B shows cross-sectional views of the grill taken at the respective portions indicated in FIG. 6A. FIG. 7 is a block diagram showing the control relationships among the main components of the voice recognition apparatus. FIG. 8 is an exploded perspective view of the cover. FIG. 9 shows the cover with the window removed. FIG. 10 is an exploded view showing the state before the voice input PCB is coupled to the window support. FIG. 11 is a cross-sectional view showing the voice input PCB coupled to the window support.

Referring to FIGS. 3 to 11, the voice recognition apparatus 1 according to an embodiment of the present invention may include a cover 10, a main body 40, a grill 20, and a base 30. The main body 40 is supported by the base 30 located below it, and the cover 10 may be coupled to the upper portion of the main body 40.

The main body 40 is disposed inside the grill 20. The main body 40 need not lie entirely within the grill 20; as in the embodiment, a portion of the main body 40 may protrude through the upper end of the grill 20. The grill 20 has a plurality of through holes 20h, has a vertically elongated cylindrical shape, and surrounds the main body 40.

A porous filter (not shown) may be attached to the inner surface of the grill 20 so that dust does not enter the grill 20 through the through holes 20h. The filter may be made of a material having fine pores, such as a mesh or nonwoven fabric, and may be attached to the inner surface of the grill 20 by an adhesive member such as double-sided tape. The filter also conceals components disposed inside the grill 20, such as the speakers 43 and 44 and the main body case, so that they are not visible from the outside through the through holes 20h.

Meanwhile, in FIG. 3 the through holes 20h are drawn on only part of the grill 20 and omitted elsewhere, but this is solely to avoid cluttering the drawing. The through holes 20h are formed over most of the grill 20, so that sound output from the speakers 43 and 44, described later, can spread evenly in all directions (front, rear, left, and right) through the through holes 20h.

The cover 10 may include a window 11, a window support 12, a display 13, a display PCB (printed circuit board) 14, and a cover housing 15. The window 11, the window support 12, the display 13, and the display PCB 14 may be disposed inside the cover housing 15.

Referring to FIGS. 4 and 5, the cover housing 15 is made of synthetic resin, is coupled to the upper portion of the main body 40, and has an opening 15h formed in its upper surface. The cover housing 15 is cylindrical and may include a side wall 151 whose upper end defines the opening 15h, and a partition plate 152 that extends from the inner surface of the side wall 151 and divides the interior of the side wall 151 into upper and lower parts. The display PCB 14, the display 13, the window support 12, and the window 11 are disposed above the partition plate 152.

The lower end 151a of the side wall 151 preferably contacts the upper end of the grill 20, although slight play due to tolerances may exist between the two. When viewed from above, the lower end 151a of the side wall 151 overlaps at least part of the upper end of the grill 20. A gap exists between the outer surface of the side wall 151 and the outer surface of the grill 20, but apart from this gap the two outer surfaces together form a single continuous contour.

The upper end holding portion 153 extends downward from the lower end 151a of the side wall 151 and is coupled to the grill 20. This coupling does not use a separate fastening member such as a bolt; instead, the upper end holding portion 153 is inserted (or fitted) into the opening at the upper end of the grill 20, preferably as an interference fit that relies on the elasticity and restoring force of the grill 20 or of the upper end holding portion 153 itself.

The upper end holding portion 153 is located further inward than the lower end of the side wall 151 (that is, the outer surface of the cover housing 15 is recessed at the lower end 151a of the side wall 151 to form the outer surface of the upper end holding portion 153). Accordingly, a surface that extends from the outer surface of the side wall 151 to the upper end holding portion 153 and faces the upper end of the grill 20 is formed at the lower end of the side wall 151.

The cover housing 15 may have a protrusion 154 projecting from the inner surface of the side wall 151, and a protrusion insertion groove 418 that engages with the protrusion 154 may be formed in the front portion of the main body 40. When the cover housing 15 and the main body 40 are assembled, the protrusion 154 moves along the outer surface of the main body 40 and, on reaching the protrusion insertion groove 418, is inserted into it by the elasticity of the synthetic resin cover housing 15 itself.

Because the outer surface of the upper end holding portion 153 contacts the inner surface of the grill 20, the shape of the upper end of the grill 20 is maintained. In particular, when the grill 20 is made of metal, the grill 20 deforms to match the shape of the upper end holding portion 153, so the upper end of the grill 20 can be kept in a shape corresponding to the upper end holding portion 153.

Meanwhile, when the upper end holding portion 153 extends along the lower end 151a of the side wall 151 in an elliptical shape, a metal sheet may be rolled into a cylindrical grill 20 with a perfectly circular cross section and the upper end of the grill 20 then fitted over the upper end holding portion 153. The grill 20 is thereby deformed into an ellipse corresponding to the shape of the upper end holding portion 153 and can be held in that deformed state.

As in the embodiment, suppose a perfectly circular window 11 of radius r is inclined at a predetermined angle with respect to a horizontal plane (indicated by θ1 in FIG. 6A; an acute angle, hereinafter also referred to as the "first angle"), and the vector Vh obtained by orthogonally projecting the normal vector Vs of the upper surface of the window 11 onto the horizontal plane points straight ahead. The orthogonal projection of the window 11 onto the horizontal plane is then an ellipse with a semi-minor axis of r·cosθ1 in the front-rear direction and a semi-major axis of r in the left-right direction. For a unified outer appearance of the voice recognition apparatus 1, the cross section of the grill 20 therefore preferably also has a shape corresponding to this ellipse (that is, a shape in which the ratio of the semi-minor axis to the semi-major axis is cosθ1 : 1), and forming the upper end holding portion 153 in a shape corresponding to the ellipse allows the cross section of the grill 20 to be kept in that shape. Here, "corresponding shape" covers not only the case in which the two figures coincide exactly but also the case in which they are similar (for example, the two figures have the same ratio of semi-minor to semi-major axis); the term is used in the same sense below.
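
The projection described above can be checked numerically. The following is a minimal sketch, not part of the disclosure: the function names and the example dimensions (a 50 mm window radius and a 60 mm grill semi-axis) are hypothetical and chosen only for illustration.

```python
import math

def projected_window_axes(radius, tilt_deg):
    """Semi-axes of a tilted circular window projected onto the horizontal plane.

    Returns (semi_major, semi_minor): the left-right axis keeps the full
    radius, while the front-rear axis is foreshortened by cos(tilt).
    """
    theta = math.radians(tilt_deg)
    return radius, radius * math.cos(theta)

def corresponds(axes_a, axes_b, rel_tol=1e-9):
    """True if two ellipses are similar, i.e. share the minor/major ratio."""
    ratio_a = axes_a[1] / axes_a[0]
    ratio_b = axes_b[1] / axes_b[0]
    return math.isclose(ratio_a, ratio_b, rel_tol=rel_tol)

# Hypothetical dimensions in millimetres; only the ratio matters.
window = projected_window_axes(radius=50.0, tilt_deg=20.0)
grill = (60.0, 60.0 * math.cos(math.radians(20.0)))
print(corresponds(window, grill))
```

The `corresponds` helper reflects the definition of "corresponding shape" in the text: the two ellipses need not be the same size, only share the same axis ratio.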

The angle θ1 at which the window 11 is inclined with respect to the horizontal plane is chosen with the user's line of sight in a typical usage environment in mind. It is set so that, when the voice recognition apparatus 1 is placed on a support about 1 m high, such as a kitchen counter or dining table, the line of sight of a typical adult standing in front of the apparatus makes an angle close to 90 degrees with the upper surface of the window 11. The angle is preferably about 20 degrees, but is not necessarily limited to this value.
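
The relationship between the tilt angle and the viewer's line of sight can be sketched as follows. This is an illustrative geometry only, assuming the viewer stands directly in front of the apparatus; the eye height and viewing distance are hypothetical values that do not appear in the source, which gives only the support height of about 1 m and the tilt of about 20 degrees.

```python
import math

def perpendicular_tilt_deg(eye_height_m, window_height_m, horizontal_distance_m):
    """Window tilt (degrees from horizontal) at which the line of sight
    from the eye to the window is perpendicular to the window surface."""
    rise = eye_height_m - window_height_m
    return math.degrees(math.atan2(horizontal_distance_m, rise))

def sight_angle_to_window_deg(tilt_deg, eye_height_m, window_height_m,
                              horizontal_distance_m):
    """Angle between the line of sight and the window surface for a given tilt."""
    ideal = perpendicular_tilt_deg(eye_height_m, window_height_m,
                                   horizontal_distance_m)
    return 90.0 - abs(ideal - tilt_deg)

# Hypothetical viewer: eyes at 1.6 m, standing 0.25 m in front of a window
# that sits at 1.0 m and is tilted by 20 degrees.
angle = sight_angle_to_window_deg(20.0, 1.6, 1.0, 0.25)
print(round(angle, 1))
```

Under these assumed numbers the perpendicular tilt comes out near 22.6 degrees, so a 20 degree tilt leaves the line of sight within a few degrees of perpendicular to the window.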

Meanwhile, the display panel 131 may be inclined at a predetermined angle from the horizontal so that the displayed screen faces forward and upward, and is preferably inclined at the same angle θ1 as the window 11. The window support plate 121, described later, is likewise inclined at the same angle as the display panel 131 (or the window 11).

In more detail, referring to FIGS. 6A and 6B, the upper end of the side wall 151 of the cover housing 15 is a perfect circle with outer diameter L1, and the lower end 151f of the side wall 151 is inclined at an angle θ2 to the horizontal plane (θ2 < θ1; hereinafter θ2 is also referred to as the "second angle"), with an outer shape of diameter La in the left-right direction and diameter Lb in the front-rear direction. Because the outer surface of the side wall 151 is inclined at a predetermined angle θ3 to the vertical, the orthogonal projections of cross sections S1 and S2 in FIG. 6B onto the horizontal plane do not coincide exactly. However, if θ3 is sufficiently small (preferably 5 degrees or less), La is approximately equal to L1, so La = L1 is assumed below. Furthermore, if the difference between θ1 and θ2 is sufficiently small (preferably 5 degrees or less), Lb is also approximately equal to L1, so Lb = L1 is assumed below.

Here, θ3 denotes the angle between the outer surface of the side wall 151 and the vertical. It may be constant over the entire outer surface of the side wall 151, or it may vary along the circumference of the side wall 151.

Meanwhile, referring to cross sections S3 and S4 in FIG. 6B, the outer shape of the grill 20 is an ellipse (L1 > L2) with a major diameter L1 in the left-right direction and a minor diameter L2 in the front-rear direction. With La = L1 and Lb = L1 as assumed above, L2 equals L1·cosθ1. That is, the outline of the grill 20 projected orthogonally onto the horizontal plane is an ellipse whose front-rear diameter L2 is shorter than its left-right diameter L1. Even though the window 11 is inclined, the voice recognition apparatus 1 viewed from above has an overall elliptical outline and thus a unified appearance.
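
The dimensional relations stated above can be collected in one place. The block below only restates the approximations and the result given in the description, using the same symbols; the final equality shows that the grill's diameter ratio matches the axis ratio of the projected window of radius r.

```latex
\begin{align*}
L_a &\approx L_1, \qquad L_b \approx L_1
      \quad (\theta_3 \text{ and } \theta_1 - \theta_2 \text{ sufficiently small}),\\
L_2 &= L_1 \cos\theta_1,\\
\frac{L_2}{L_1} &= \cos\theta_1 = \frac{r\cos\theta_1}{r}.
\end{align*}
```

For the preferred value θ1 ≈ 20 degrees, cos θ1 ≈ 0.940, so the front-rear diameter of the grill is about 6% shorter than its left-right diameter.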

Because the side wall 151 is located above the grill 20, it forms part of the exterior of the voice recognition apparatus 1, whereas the upper end holding portion 153 is inserted completely inside the grill 20 and hidden by it, so it is not visible from outside the voice recognition apparatus 1.

A positioning protrusion (not shown) may project from the lower end of the side wall 151, and a positioning groove into which the positioning protrusion is inserted when the grill 20 is in its proper position may be formed in the upper end of the grill 20.

The window 11 may be disposed in the opening 15h of the cover housing 15. The window 11 is machined from a transparent plate of uniform thickness, and its side surface (or outer peripheral surface) is perpendicular to its upper and lower surfaces.

On the inner surface of the cover housing 15, a portion 151b extending downward from the upper end of the cover housing 15 is parallel to the direction in which the upper surface of the window 11 faces (that is, the direction of the normal vector Vs in FIG. 6A). Because this upper inner surface 151b of the side wall 151 defines the opening 15h, it is hereinafter referred to as the opening defining surface. The opening defining surface 151b is a cylindrical surface extending from the periphery of the opening 15h, and the window 11 is disposed inside the region it surrounds. Preferably, the upper surface of the window 11 lies in the same plane as the upper end of the cover housing 15 (or the plane containing the opening 15h), giving the impression that the upper surface of the voice recognition apparatus 1 is a single plane.

The opening defining surface 151b is parallel to the vector Vs at every position. That is, if the cover housing 15 is cut by any plane parallel to the vector Vs, the opening defining surface 151b is parallel to the vector Vs in that cross section.

Because the opening defining surface 151b and the side surface of the window 11 are parallel, aligning the center of the window 11 with the center of the opening defining surface 151b along the vector Vs keeps a constant gap g between the opening defining surface 151b and the entire side surface of the window 11. Viewed from above, a uniform gap g is thus maintained between the window 11 and the upper end of the cover housing 15, which also gives the product a more finished look. The gap g is preferably set to the minimum value at which the side surface of the window 11 does not interfere with the opening defining surface 151b when the window 11 is pressed to actuate a contact switch (not shown). The contact switch may be electrically connected to a circuit formed on the board and receive the user's button operations.

When the cover housing 15 is cut by any vertical plane, the outer surface of the side wall 151 in that cross section may be parallel to the normal vector Vs or may recede progressively from the normal vector Vs toward the bottom. When the cover housing 15 is injection molded, it is ejected vertically downward from the first mold, which forms the side wall 151. The outer surface of the side wall 151 must have the shape described above for the cover housing 15 to release cleanly from the first mold.

In contrast, forming the opening 15h in the upper surface of the cover housing 15 requires a separate second mold having a core that is inserted into the opening 15h. With the first mold removed, the cover housing 15 can be separated from the second mold by moving the second mold, and this movement is in the same direction as the normal vector Vs.

Referring to FIGS. 5 and 8, the display PCB 14 is disposed on the upper surface of the partition plate 152 and supports the display 13 from below. The display PCB 14 includes a circuit electrically connected to the display 13, and the display 13 is connected to this circuit through a connector 132. Four contact switches may be arranged on the upper surface of the display PCB 14, at the front, rear, left, and right of the display 13.

An NFC module 50d (see FIG. 7) may be disposed on the display PCB 14. The NFC module 50d enables NFC communication and may be placed in an NFC mounting portion 146a formed on the second board arm 146. NFC (near field communication) is a type of radio-frequency identification (RFID) technology, a contactless communication technology that uses the 13.56 MHz frequency band. Because its communication range is short, it offers relatively good security, and its low cost has drawn attention to it as a next-generation short-range communication technology. Because it supports both reading and writing data, it does not need the dongle (reader) previously required to use RFID. It resembles existing short-range technologies such as Bluetooth, but has the advantage of not requiring pairing between devices as Bluetooth does.

The display 13 is a device that receives an electrical signal and displays an image. It is connected to the circuit of the display PCB 14 and displays an image according to a control signal input through that circuit. The display 13 may include a display panel 131 and a connector 132 that connects the display panel 131 to the circuit of the display PCB 14 (see FIG. 8). The display panel 131 may be attached to the upper surface of the display PCB 14 by an adhesive member.

The display PCB 14 is connected by a cable to a main PCB (not shown), described later, so the controller that controls the display 13 may be mounted on either the display PCB 14 or the main PCB.

Various kinds of information may be displayed on the screen of the display panel 131. According to a program stored in the memory 250, the controller 240 can control not only the driving of the display panel 131 but also the overall operation of the electrical components of the voice recognition apparatus 1. A user interface (UI) may be displayed on the display panel 131, and this interface is implemented by executing the program.

The interface may display playback information for the speakers 43 and 44. For example, it may display a play/stop/track selection menu and information such as playback status, song title, artist and album information, lyrics, and volume.

When the voice recognition apparatus 1 includes a communication module 50, the interface may display information exchanged through the communication module 50. For example, the interface may display a menu for controlling the accessories 2, 3a, and 3b that communicate with the communication module 50, or may display information processed from data transmitted by the accessories 2, 3a, and 3b. Specifically, the interface may show the network connection status of the communication module 50 and information such as temperature, humidity, and brightness sensed by a sensor in the accessory 2. The interface may also display a menu for controlling the output of the speakers 43 and 44, for example a menu for selecting a song or album to play through the speakers 43 and 44, information about the album or song (for example, song title, album name, and artist), and the current volume level.

The communication module 50 may include communication modules of various types. For example, the communication module 50 may include a Wi-Fi module 50a, a Bluetooth module 50b, a ZigBee module 50c, and an NFC module 50d.
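
One way to model which sub-modules a given communication module 50 carries is a set of flags. The sketch below is purely illustrative: the class and function names are hypothetical, and the source does not describe how the firmware represents the installed modules.

```python
from enum import Flag, auto

class CommModule(Flag):
    """Hypothetical flags for the sub-modules of the communication module 50."""
    WIFI = auto()       # Wi-Fi module 50a
    BLUETOOTH = auto()  # Bluetooth module 50b
    ZIGBEE = auto()     # ZigBee module 50c
    NFC = auto()        # NFC module 50d

def has_module(installed, required):
    """True if every module in `required` is present in `installed`."""
    return (installed & required) == required

# An apparatus fitted with all four modules named in the description.
installed = (CommModule.WIFI | CommModule.BLUETOOTH
             | CommModule.ZIGBEE | CommModule.NFC)
print(has_module(installed, CommModule.ZIGBEE))
```

A check such as `has_module(installed, CommModule.ZIGBEE)` would correspond to the gateway-less embodiment, in which the apparatus itself must carry a ZigBee module to talk to the accessories directly.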

Meanwhile, the menus displayed on the interface can be operated through the operation unit 181.

The operation unit 181 may include contact switches. How the output signal of each contact switch is processed is determined by a program stored in advance in the memory 250. For example, menus arranged left and right on the interface may be selected according to the operation signals of the first and second contact switches 181a and 181b, and menus arranged up and down on the interface may be selected according to the operation signals of the third and fourth contact switches 181c and 181d.
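
The switch-to-menu behavior described above might be handled along the following lines. This is a sketch under stated assumptions: the source says only that switches 181a and 181b act on left-right menus and 181c and 181d on up-down menus, so the specific direction assigned to each switch, the grid model of the menu, and all names below are hypothetical.

```python
# Hypothetical mapping from contact switch to a (row, column) step.
# Which switch of each pair means which direction is an assumption.
SWITCH_STEPS = {
    "181a": (0, -1),  # assumed: move selection left
    "181b": (0, 1),   # assumed: move selection right
    "181c": (-1, 0),  # assumed: move selection up
    "181d": (1, 0),   # assumed: move selection down
}

def move_selection(position, switch_id, rows, cols):
    """Return the new (row, col) after a switch press, clamped to the menu grid."""
    d_row, d_col = SWITCH_STEPS[switch_id]
    row = min(max(position[0] + d_row, 0), rows - 1)
    col = min(max(position[1] + d_col, 0), cols - 1)
    return (row, col)

# Pressing the assumed "right" switch on a 2 x 3 menu grid.
print(move_selection((0, 0), "181b", rows=2, cols=3))
```

Keeping the mapping in a table rather than in branching code mirrors the statement that the handling of each switch signal is determined by the stored program and could be changed without touching the hardware.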

The user can communicate with the Bluetooth module 50b using an external device such as a smartphone or laptop, and various data such as music and images can thereby be stored in the memory 250. In particular, the controller 240 can control the speakers 43 and 44 to output music stored in the memory 250, and functions such as track selection, play, and stop can be performed through the contact switches.

Referring to FIGS. 8 and 9, a substantially circular window support 12 may be disposed above the display 13. The window support 12 is an injection-molded synthetic resin part, preferably formed as a single piece. An opening is formed in the window support 12, and the screen of the display 13 is exposed through this opening.

The window support 12 may include a window support plate 121 that has an opening 12h at its center and on whose upper surface the window 11 is placed, operating protrusions projecting downward from the window support plate 121, and a plurality of support bosses 122a, 122b, 122c, and 122d projecting downward from the window support plate 121.

The support bosses 122a, 122b, 122c, and 122d may extend vertically downward. Like the window 11, the window support plate 121 may be inclined at the first angle θ1 to the horizontal plane. In this case the support bosses 122a, 122b, 122c, and 122d are not perpendicular to the window support plate 121, and preferably form with it the complement of θ1 (90° − θ1).

Below, as shown in FIG. 9, the window support plate 121 is divided, relative to the opening 12h, into a first region SE1 at the rear, a second region SE2 at the front, a third region SE3 on the left, and a fourth region SE4 on the right.

At least one of the support bosses 122a, 122b, 122c, and 122d may be formed in each of the first region SE1 and the second region SE2. Four support bosses 122a, 122b, 122c, and 122d may be formed so that the window support plate 121 is supported stably without wobbling; of these, the first support boss 122a and the second support boss 122b may be located in the first region SE1, and the third support boss 122c and the fourth support boss 122d in the second region SE2.

Meanwhile, the window 11 is a circular transparent plate that transmits the screen of the display 13, and is preferably made of acrylic. The user can view the screen shown on the display 13 through the window 11. The window 11 need not be transparent over its entire area. So that the window support 12 is hidden and only the screen of the display panel 131 exposed through the opening 12h of the window support 12 is visible from outside the voice recognition apparatus 1, only a predetermined region 11b roughly corresponding to the opening 12h may be transparent, while the remaining region 11a is tinted opaque or translucent, or covered with a film or the like, to conceal what lies beneath (see FIG. 3).

The window 11 may be bonded to the upper surface of the window support plate 121 of the window support 12 using double-sided tape or the like. Being made of synthetic resin, the window 11 can flex elastically when pressure within a certain limit is applied. This flexing helps the contact switches operate more smoothly. Because the flexing is elastic, the window 11 naturally returns to its original shape once the pressing force is removed.

도 3 내지 도 5를 참조하면, 본체(40)는 하측에 배치된 베이스(30)에 의해 지지되고, 상단부는 커버 하우징(15)과 결합될 수 있다. 본체(40)는 내측으로 캐비티(49)를 형성하는 스피커 케이스와, 캐비티(49) 내에 배치되는 적어도 하나의 스피커(43, 44)를 포함할 수 있다. 실시예에서는 스피커 케이스 내에 2 개의 스피커(43, 44)가 상하로 배치되고, 상측에 배치되는 스피커(43)는 고음 대역을 출력하는 트위터(tweeter)이고, 하측에 배치되는 스피커(44)는 저음 대역을 출력하는 우퍼(woofer)이다. Referring to FIGS. 3 to 5, the main body 40 is supported by the base 30 disposed below it, and its upper end may be coupled to the cover housing 15. The main body 40 may include a speaker case that forms a cavity 49 inside, and at least one speaker 43, 44 disposed in the cavity 49. In the embodiment, two speakers 43 and 44 are arranged vertically in the speaker case; the upper speaker 43 is a tweeter that outputs the high-frequency band, and the lower speaker 44 is a woofer that outputs the low-frequency band.

도 3 내지 도 11을 참조하면, 커버(10)에는 음성입력 PCB(17, 18)가 설치된다. 음성입력 PCB(17, 18)에는 사용자의 음성이 입력된다. 음성입력 PCB(17, 18)는 본체(40)에 배치된 음성인식 PCB(40a)와 회로 연결된다. 음성입력 PCB(17, 18)는 하네스 케이블(17b, 18b)을 통해 음성인식 PCB(40a)와 연결될 수 있다. 음성입력 PCB(17, 18)는 상기 입력된 사용자의 음성을 음성인식 PCB(40a)가 인식할 수 있는 음파신호로 변환하고, 음성인식 PCB(40a)는 음성입력 PCB(17, 18)로부터 입력되는 음파신호를 분석하여 사용자의 음성을 인식할 수 있다. Referring to FIGS. 3 to 11, voice input PCBs 17 and 18 are installed in the cover 10. The user's voice is input to the voice input PCBs 17 and 18. The voice input PCBs 17 and 18 are electrically connected to the voice recognition PCB 40a disposed in the main body 40, and may be connected to it through harness cables 17b and 18b. The voice input PCBs 17 and 18 convert the input user's voice into a sound wave signal that the voice recognition PCB 40a can recognize, and the voice recognition PCB 40a can recognize the user's voice by analyzing the sound wave signal input from the voice input PCBs 17 and 18.

음성입력 PCB(17, 18)는 윈도우 서포트(12)에 설치된다. 음성입력 PCB(17,18)는 다수개가 구비될 수 있고, 상기 다수개의 음성입력 PCB(17,18)는 윈도우 서포트(12)에 형성된 개구부에 대해 대칭(symmetry)으로 배치될 수 있다. 본 실시예에서 음성입력 PCB(17, 18)는 2개로 구비되어, 제1 음성입력 PCB(17) 및 제2 음성입력 PCB(18)를 포함한다. The voice input PCBs 17 and 18 are installed in the window support 12. A plurality of voice input PCBs 17 and 18 may be provided, and they may be arranged symmetrically with respect to the opening formed in the window support 12. In this embodiment, two voice input PCBs are provided: a first voice input PCB 17 and a second voice input PCB 18.

제1 음성입력 PCB(17, 18)는 윈도우 서포트(12)에 형성된 개구부를 기준으로 전방에 위치하는 제1 음성입력 PCB(17)와, 개구부를 기준으로 후방에 위치하는 제2 음성입력 PCB(18)를 포함한다. 제1 음성입력 PCB(17)는 윈도우 지지판(121)의 제2 영역(SE2)에 배치되고, 제2 음성입력 PCB(18)는 윈도우 지지판(121)의 제1 영역(SE1)에 배치된다. The voice input PCBs 17 and 18 include a first voice input PCB 17 positioned in front of the opening formed in the window support 12 and a second voice input PCB 18 positioned behind the opening. The first voice input PCB 17 is disposed in the second region SE2 of the window support plate 121, and the second voice input PCB 18 is disposed in the first region SE1 of the window support plate 121.

제1 음성입력 PCB(17)는 윈도우 지지판(121)의 센터를 기준으로 우측으로 치우쳐서 배치되고, 제2 음성입력 PCB(18)는 윈도우 지지판(121)의 센터를 기준으로 좌측으로 치우쳐서 배치된다. The first voice input PCB 17 is disposed biased to the right with respect to the center of the window support plate 121 and the second voice input PCB 18 is disposed biased to the left with respect to the center of the window support plate 121.

한편, 본체(40)의 좌, 우측 중 어느 한쪽에는 직비 모듈(50c)이 구비될 수 있다. Meanwhile, a Zigbee module 50c may be provided on either the left or the right side of the main body 40.

한편, 윈도우 서포트(12)에는 윈도우(11)와 대향하는 상면에 음성입력 PCB(17, 18)가 수용되는 PCB 수용홈(12a, 12b)이 형성된다. 음성입력 PCB(17, 18)는 PCB 수용홈(12a, 12b)에 수용된 상태일 때, PCB 수용홈(12a, 12b)의 외측으로 돌출되지 않는다. 즉, PCB 수용홈(12a, 12b)은 윈도우 서포트(12)의 상면에 음성입력 PCB(17, 18)의 상하두께와 대응하는 깊이로 함입되어 형성된다. 음성입력 PCB(17, 18)가 PCB 수용홈(12a, 12b)에 수용된 상태이면, 음성입력 PCB(17, 18)의 상면은 윈도우 서포트(12)의 상면과 일치한다. Meanwhile, PCB receiving grooves 12a and 12b, in which the voice input PCBs 17 and 18 are received, are formed in the upper surface of the window support 12 facing the window 11. When received in the PCB receiving grooves 12a and 12b, the voice input PCBs 17 and 18 do not protrude out of the grooves. That is, the PCB receiving grooves 12a and 12b are recessed into the upper surface of the window support 12 to a depth corresponding to the vertical thickness of the voice input PCBs 17 and 18. With the voice input PCBs 17 and 18 received in the PCB receiving grooves 12a and 12b, their upper surfaces are flush with the upper surface of the window support 12.

PCB 수용홈(12a, 12b)은 제1 음성입력 PCB(17)가 수용되는 제1 PCB 수용홈(12a)과, 제2 음성입력 PCB(18)가 수용되는 제2 음성음력 PCB 삽입부(12b)를 포함한다. 제1 PCB 수용홈(12a)은 윈도우 서포트(12)에 형성된 개구부(12h)를 기준으로 전방에 배치되고, 제2 PCB 수용홈(12b)은 개구부(12h)를 기준으로 후방에 배치된다. 제1 PCB 수용홈(12a)은 윈도우 지지판(121)의 제2 영역(SE2)에 형성되고, 제2 PCB 수용홈(12b)은 윈도우 지지판(121)의 제1 영역(SE1)에 형성된다. The PCB receiving grooves 12a and 12b include a first PCB receiving groove 12a in which the first voice input PCB 17 is received and a second PCB receiving groove 12b in which the second voice input PCB 18 is received. The first PCB receiving groove 12a is disposed in front of the opening 12h formed in the window support 12, and the second PCB receiving groove 12b is disposed behind the opening 12h. The first PCB receiving groove 12a is formed in the second region SE2 of the window support plate 121, and the second PCB receiving groove 12b is formed in the first region SE1 of the window support plate 121.

제1 PCB 수용홈(12a)은 윈도우 지지판(121)의 센터를 기준으로 우측으로 치우쳐서 형성되고, 제2 PCB 수용홈(12b)은 윈도우 지지판(121)의 센터를 기준으로 좌측으로 치우쳐서 형성된다. The first PCB receiving groove 12a is formed to be biased to the right with respect to the center of the window supporting plate 121 and the second PCB receiving groove 12b is formed to be biased to the left with respect to the center of the window supporting plate 121.

윈도우 서포트(12)는 PCB 수용홈(12a, 12b)의 바닥으로부터 돌출된 위치설정 돌기(12c, 12d)를 더 포함한다. 음성입력 PCB(17, 18)에는 위치설정 돌기(12c, 12d)가 삽입되는 위치설정 홀(17a, 18a)이 형성된다. 위치설정 돌기(12c, 12d)는 사각형상의 PCB 수용홈(12a, 12b)의 모서리에 하나가 형성되고, 위치설정 홀(17a, 18a)은 사각형상의 음성입력 PCB(17, 18)의 모서리에 하나가 형성된다. 작업자는 음성입력 PCB(17, 18)를 PCB 수용홈(12a, 12b)에 수용할 시, 위치설정 홀(17a, 18a)을 위치설정 돌기(12c, 12d)에 끼워서, 음성입력 PCB(17, 18)를 PCB 수용홈(12a, 12b)의 정확한 위치에 수용시킬 수 있다. The window support 12 further includes positioning projections 12c and 12d protruding from the bottoms of the PCB receiving grooves 12a and 12b. Positioning holes 17a and 18a, into which the positioning projections 12c and 12d are inserted, are formed in the voice input PCBs 17 and 18. One positioning projection 12c, 12d is formed at a corner of each rectangular PCB receiving groove 12a, 12b, and one positioning hole 17a, 18a is formed at a corner of each rectangular voice input PCB 17, 18. When placing the voice input PCBs 17 and 18 in the PCB receiving grooves 12a and 12b, an assembler fits the positioning holes 17a and 18a over the positioning projections 12c and 12d, so that the voice input PCBs 17 and 18 are seated at the correct positions in the grooves.

위치설정 돌기(12c, 12d)는 제1 PCB 수용홈(12a)이 바닥에서 상측으로 돌출 형성되는 제1 위치설정 돌기(12c)와, 제2 PCB 수용홈(12b)이 바닥에서 상측으로 돌출 형성되는 제2 위치설정 돌기(12d)를 포함한다. 그리고, 위치설정 홀(17a, 18a)은 제1 음성입력 PCB(17)에 형성되어 제1 위치설정 돌기(12c)가 삽입되는 제1 위치설정 홀(17a)과, 제2 음성입력 PCB(18)에 형성되어 제2 위치설정 돌기(12d)가 삽입되는 제2 위치설정 홀(18a)을 포함한다. The positioning projections 12c and 12d include a first positioning projection 12c protruding upward from the bottom of the first PCB receiving groove 12a and a second positioning projection 12d protruding upward from the bottom of the second PCB receiving groove 12b. The positioning holes 17a and 18a include a first positioning hole 17a formed in the first voice input PCB 17, into which the first positioning projection 12c is inserted, and a second positioning hole 18a formed in the second voice input PCB 18, into which the second positioning projection 12d is inserted.

PCB 수용홈(12a, 12b)의 바닥에는 개구부(12e, 12f)가 형성된다. 개구부(12e, 12f)는 음성입력 PCB(17, 18)를 음성인식 PCB(40a)와 연결할 시, 하네스 케이블(17b, 18b)이 관통하는 홀의 기능을 한다. 개구부(12e, 12f)는 제1 PCB 수용홈(12a)의 바닥에 형성되는 제1 개구부(12e)와, 제2 PCB 수용홈(12b)의 바닥에 형성되는 제2 개구부(12f)을 포함한다. Openings 12e and 12f are formed in the bottoms of the PCB receiving grooves 12a and 12b. The openings 12e and 12f serve as holes through which the harness cables 17b and 18b pass when the voice input PCBs 17 and 18 are connected to the voice recognition PCB 40a. The openings 12e and 12f include a first opening 12e formed in the bottom of the first PCB receiving groove 12a and a second opening 12f formed in the bottom of the second PCB receiving groove 12b.

개구부(12e, 12f)는 제1 영역(SE1) 및 제2 영역(SE2)에 각각 형성된 슬릿(121a, 121b)의 적어도 일부를 구성한다. 제1 PCB 수용홈(12a)는 제2 영역(SE2)에 형성된 제2 슬릿(121b) 중 우측으로 치우친 위치에 배치되어, 제2 영역(SE2)에 형성된 제3 지지보스(122c) 및 제4 지지보스(122d) 사이에서 제4 지지보스(122d)의 바로 옆에 형성된다. 그리고, 제2 PCB 수용홈(12b)는 제1 영역(SE1)에 형성된 제1 슬릿(121a) 중 좌측으로 치우친 위치에 배치되어, 제1 영역(SE1)에 형성된 제1 지지보스(122a) 및 제2 지지보스(122b) 사이에서 제1 지지보스(122a)의 바로 옆에 형성된다. 따라서, 사용자가 윈도우(11)를 가압할 시 윈도우 지지판(121)이 쉽게 탄성 변형되면서 컨택 스위치를 쉽게 작동시킬 수 있다. The openings 12e and 12f constitute at least part of the slits 121a and 121b formed in the first region SE1 and the second region SE2, respectively. The first PCB receiving groove 12a is disposed at a position offset to the right along the second slit 121b formed in the second region SE2, and is formed between the third support boss 122c and the fourth support boss 122d in the second region SE2, immediately next to the fourth support boss 122d. The second PCB receiving groove 12b is disposed at a position offset to the left along the first slit 121a formed in the first region SE1, and is formed between the first support boss 122a and the second support boss 122b in the first region SE1, immediately next to the first support boss 122a. Therefore, when the user presses the window 11, the window support plate 121 elastically deforms easily, and the contact switch can be operated easily.

개구부(12e, 12f)는 슬릿(121a, 121b)에 비해 전후방향의 폭이 넓게 형성된다. 음성입력 PCB(17, 18)는 하네스 케이블(17b, 18b)을 통해 음성인식 PCB(40a)와 연결되는 바, 하네스 케이블(17b, 18b)의 하단에는 음성인식 PCB(40a)와 연결되는 커넥터(17c, 18c)가 결합된다. 이 커넥터(17c, 18c)는 개구부(12e, 12f)를 통과하여 윈도우 서포트(12)의 아래로 빠져나가야 하므로, 슬릿(121a, 121b)보다 전후방향의 폭이 넓게 형성되는 것이 바람직하다. The openings 12e and 12f are formed wider in the front-rear direction than the slits 121a and 121b. The voice input PCBs 17 and 18 are connected to the voice recognition PCB 40a through the harness cables 17b and 18b, and connectors 17c and 18c that connect to the voice recognition PCB 40a are attached to the lower ends of the harness cables 17b and 18b. Because the connectors 17c and 18c must pass through the openings 12e and 12f to exit below the window support 12, the openings 12e and 12f are preferably formed wider in the front-rear direction than the slits 121a and 121b.

하네스 케이블(17b, 18b)은 제1 음성입력 PCB(17) 및 음성인식 PCB(40a)를 연결하는 제1 하네스 케이블(17b)과, 제2 음성입력 PCB(18) 및 음성인식 PCB(40a)를 연결하는 제2 하네스 케이블(18b)를 포함한다. 그리고, 커넥터(17c, 18c)는 제1 하네스 케이블(17b)의 하단에 결합되어 음성인식 PCB(40a)와 연결되는 제1 커넥터(17c)와, 제2 하네스 케이블(18b)의 하단에 결합되어 음성인식 PCB(40a)와 연결되는 제2 커넥터(18c)를 포함한다. The harness cables 17b and 18b include a first harness cable 17b connecting the first voice input PCB 17 to the voice recognition PCB 40a and a second harness cable 18b connecting the second voice input PCB 18 to the voice recognition PCB 40a. The connectors 17c and 18c include a first connector 17c attached to the lower end of the first harness cable 17b and connected to the voice recognition PCB 40a, and a second connector 18c attached to the lower end of the second harness cable 18b and connected to the voice recognition PCB 40a.

윈도우(11)에는 윈도우(11)의 상측에서 윈도우(11)의 하측으로 사용자의 음성이 통과하는 음성 통과홀(11c, 11d)이 형성된다. 음성 통과홀(11c, 11d)은 윈도우(11)의 센터를 기준으로 전방 영역에 형성되는 제1 음성 통과홀(11c)과, 윈도우(11)의 센터를 기준으로 후방 영역에 형성되는 제2 음성 통과홀(11d)을 포함한다. 제1 음성 통과홀(11c)은 제1 음성입력 PCB(17)로 음성을 안내하고, 제2 음성 통과홀(11d)은 제2 음성입력 PCB(18)로 음성을 안내한다. Voice passage holes 11c and 11d, through which the user's voice passes from the upper side of the window 11 to its lower side, are formed in the window 11. The voice passage holes 11c and 11d include a first voice passage hole 11c formed in the front region relative to the center of the window 11 and a second voice passage hole 11d formed in the rear region relative to the center of the window 11. The first voice passage hole 11c guides the voice to the first voice input PCB 17, and the second voice passage hole 11d guides the voice to the second voice input PCB 18.

실시예에 따라서는, 비대칭 구조에 따라 외관 디자인이 저감될 우려가 있다. 본 실시예의 윈도우(11)는 윈도우(11)의 센터를 기준으로 전방에서 좌우방향으로 제1 음성 통과홀(11c)의 옆에 2개의 제1 데코 홀(11e)이 형성되고, 윈도우(11)의 센터를 기준으로 후방에서 좌우방향으로 제2 음성 통과홀(11d)의 옆에 2개의 제2 데코 홀(11f)이 형성된다. Depending on the embodiment, an asymmetric structure may detract from the exterior design. In the window 11 of this embodiment, two first deco holes 11e are formed beside the first voice passage hole 11c in the left-right direction, in front of the center of the window 11, and two second deco holes 11f are formed beside the second voice passage hole 11d in the left-right direction, behind the center of the window 11.

윈도우(11)는 센터부가 투명 영역(11b)으로 형성되고, 투명 영역(11b) 이외의 영역은 불투명 영역(11a)으로 형성된다. In the window 11, the center portion is formed of the transparent region 11b, and the region other than the transparent region 11b is formed of the opaque region 11a.

음성입력 PCB(17, 18)에 사용자의 음성이 쉽게 입력될 수 있도록 하기 위해, 음성 통과홀(11c, 11d)은 음성입력 PCB(17, 18)와 대응되는 위치에 형성되는 것이 바람직하다. 음성입력 PCB(17, 18)에는 음성 통과홀(11c, 11d)과 대응되는 위치에 음성 통과홀(11c, 11d)을 통과한 음성이 입력되는 음성 입력홀(17d, 18d)이 형성된다. 음성입력 PCB(17, 18)에는 음성 입력홀(17d, 18d)이 형성된 하면에 마이크(미도시)가 설치됨이 바람직하다. 상기 마이크는 음성 입력홀(17d, 18d)로 입력된 음성을 증폭하여 음성입력 PCB(17, 18)의 음파변환회로로 입력하는 기능을 한다. 즉, 윈도우(11)의 외부에서 음성 통과홀(11c, 11d)을 통과한 사용자의 음성은 음성입력 PCB(17, 18)에 형성된 음성 입력홀(17d, 18d)로 입력된 후, 상기 마이크에서 증폭되어 음성입력 PCB(17, 18)의 음파변환회로로 입력되어 음성인식 PCB(40a)가 읽을 수 있는 음파신호로 변환된다. So that the user's voice can easily reach the voice input PCBs 17 and 18, the voice passage holes 11c and 11d are preferably formed at positions corresponding to the voice input PCBs 17 and 18. Voice input holes 17d and 18d, which receive the voice that has passed through the voice passage holes 11c and 11d, are formed in the voice input PCBs 17 and 18 at positions corresponding to the voice passage holes 11c and 11d. A microphone (not shown) is preferably installed on the lower surface of each voice input PCB 17, 18, where the voice input hole 17d, 18d is formed. The microphone amplifies the voice entering through the voice input holes 17d and 18d and feeds it to the sound wave conversion circuit of the voice input PCBs 17 and 18. That is, the user's voice that has passed through the voice passage holes 11c and 11d from outside the window 11 enters the voice input holes 17d and 18d formed in the voice input PCBs 17 and 18, is amplified by the microphone, and is fed to the sound wave conversion circuit of the voice input PCBs 17 and 18, where it is converted into a sound wave signal that the voice recognition PCB 40a can read.

윈도우(11) 및 윈도우 서포트(12) 사이에는 개스킷(17e, 18e)이 설치된다. 개스킷(17e, 18e)은 음성 통과홀(11c, 11d)을 통과한 음성이, 음성입력 PCB(17, 18)의 음성 입력홀(17d, 18d)로 들어가지 않고, 윈도우(11) 및 윈도우 서포트(12) 사이의 틈새로 누설되는 것을 방지한다. 이를 위해, 개스킷(17e, 18e)은 상면이 윈도우(11)의 저면에 밀착되고, 하면은 음성입력 PCB(17, 18)의 상면에 밀착된다. 그리고, 개스킷(17e, 18e)에는 음성 통과홀(11c, 11d) 및 음성 입력홀(17d, 18d)을 연통시키는 연통홀(17f, 18f)이 형성된다. 따라서, 음성 통과홀(11c, 11d)을 통과한 음성은 전부가 연통홀(17f, 18f)을 통해 음성 입력홀(17d, 18d)로 입력되기 때문에, 음성인식을 정확하게 할 수 있게 된다. Gaskets 17e and 18e are installed between the window 11 and the window support 12. The gaskets 17e and 18e prevent voice that has passed through the voice passage holes 11c and 11d from leaking into the gap between the window 11 and the window support 12 instead of entering the voice input holes 17d and 18d of the voice input PCBs 17 and 18. To this end, the upper surfaces of the gaskets 17e and 18e are in close contact with the bottom surface of the window 11, and their lower surfaces are in close contact with the upper surfaces of the voice input PCBs 17 and 18. Communication holes 17f and 18f, which connect the voice passage holes 11c and 11d to the voice input holes 17d and 18d, are formed in the gaskets 17e and 18e. Therefore, all of the voice that has passed through the voice passage holes 11c and 11d enters the voice input holes 17d and 18d through the communication holes 17f and 18f, so that voice recognition can be performed accurately.

한편, 윈도우(11), 윈도우 서포트(12) 및 구획판(152)은 전방이 높이가 낮고 후방이 높이가 높게 기울어져서 배치되기 때문에, 커버(10)가 본체(40)의 상부에 결합된 상태일 때, 본체(40)의 상측면 중 수평면으로 형성된 후방부와, 구획판(152) 사이에는 음성인식 PCB(40a)가 설치될 수 있는 공간(S, 도 5 참조.)이 확보된다. Meanwhile, because the window 11, the window support 12, and the partition plate 152 are inclined so that the front is lower and the rear is higher, when the cover 10 is coupled to the upper part of the main body 40, a space S (see FIG. 5) in which the voice recognition PCB 40a can be installed is secured between the partition plate 152 and the rear portion of the upper surface of the main body 40, which is formed as a horizontal plane.

도 9를 참조하면, 커버 하우징(15)의 상단 유지부(153)의 전면부에는 좌, 우 양쪽으로 한 쌍의 체결보스(153c, 153d)가 후방으로 함몰 형성될 수 있고, 후면부의 좌, 우 양쪽에는 한 쌍의 체결보스(미도시, 153b)가 전방으로 함몰 형성될 수 있다. Referring to FIG. 9, a pair of fastening bosses 153c and 153d may be formed on the left and right sides of the front portion of the upper end holding portion 153 of the cover housing 15, recessed rearward, and a pair of fastening bosses (not shown, 153b) may be formed on the left and right sides of the rear portion, recessed forward.

도 3 내지 도 9를 참조하면, 커버 하우징(15)의 측벽(151)에는 볼륨 버튼(16)이 설치될 수 있다. 볼륨 버튼(16)은 돔(161)과 탄성 패드(미도시)를 포함할 수 있다. 돔(161)은 재질은 합성수지이며, 일면에 홈이 형성되고, 볼륨증가조작부(161a) 또는 볼륨감소조작부(161b)를 포함할 수 있다. Referring to FIGS. 3 to 9, a volume button 16 may be installed on the side wall 151 of the cover housing 15. The volume button 16 may include a dome 161 and an elastic pad (not shown). The dome 161 is made of synthetic resin, has a groove formed on one surface, and may include a volume increase operation portion 161a or a volume decrease operation portion 161b.

도 12는 본 발명의 일 실시예에 따른 음성 인식 서버 시스템 및 음성 인식 장치를 포함하는 스마트 홈 시스템을 간략히 도시한 도면이다. 12 is a view schematically illustrating a smart home system including a voice recognition server system and a voice recognition apparatus according to an embodiment of the present invention.

도 12를 참조하면, 본 발명의 일 실시예에 따른 스마트 홈 시스템(10)은, 통신 모듈(미도시)을 구비하여 다른 기기와 통신하거나 네트워크에 접속할 수 있는 음성 인식 장치(1)와 음성 인식 및 가전 제어를 위한 복수의 서버를 포함하는 음성 인식 서버 시스템(1100)을 포함하여 구성될 수 있다. Referring to FIG. 12, a smart home system 10 according to an embodiment of the present invention may include a voice recognition apparatus 1 that has a communication module (not shown) and can communicate with other devices or connect to a network, and a voice recognition server system 1100 that includes a plurality of servers for voice recognition and home appliance control.

한편, 음성 인식 장치(1)는, 음성 인식이 가능한 장치이다. Meanwhile, the voice recognition apparatus 1 is a device capable of voice recognition.

또한, 본 발명의 일 실시예에 따른 스마트 홈 시스템(10)은, 스마트 폰(smart phone), 태블릿(Tablet) PC 등 이동 단말기(미도시)를 포함할 수 있다. In addition, the smart home system 10 according to an embodiment of the present invention may include a mobile terminal (not shown) such as a smart phone, a tablet PC, and the like.

음성 인식 장치(1)는 내부에 통신 모듈을 구비하여 스마트 홈 시스템(10) 내/외부의 전자기기들과 통신할 수 있다. The voice recognition device 1 may include a communication module therein to communicate with electronic devices in the smart home system 10 or outside.

본 발명의 일 실시예에 따른 스마트 홈 시스템(10)은 액세스 포인트(access point: AP) 장치(7)를 더 포함할 수 있고, 음성 인식 장치(1)는 액세스 포인트 장치(7)를 통하여 무선 인터넷 네트워크에 접속하여 다른 기기들과 통신할 수 있다. The smart home system 10 according to an embodiment of the present invention may further include an access point (AP) device 7, and the voice recognition apparatus 1 may connect to a wireless Internet network through the access point device 7 to communicate with other devices.

액세스 포인트 장치(7)는 스마트 홈 시스템(10) 내의 전자 기기들에, 소정 통신 방식에 의한 무선 채널을 할당하고, 해당 채널을 통해, 무선 데이터 통신을 수행할 수 있다. The access point apparatus 7 can allocate a wireless channel according to a predetermined communication method to the electronic devices in the smart home system 10 and perform wireless data communication through the corresponding channel.

여기서, 소정 통신 방식은, 와이파이(Wi-Fi) 통신 방식일 수 있다. 이에 대응하여, 음성 인식 장치(1)가 구비하는 통신 모듈은 와이파이 통신 모듈일 수 있으나, 본 발명은 통신 방식에 한정되지 않는다. Here, the predetermined communication method may be a Wi-Fi communication method. Correspondingly, the communication module included in the voice recognition device 1 may be a Wi-Fi communication module, but the present invention is not limited to the communication method.

또는, 음성 인식 장치(1)는 다른 종류의 통신 모듈을 구비하거나 복수의 통신 모듈을 구비할 수 있다. 예를 들어, 음성 인식 장치(1)는 NFC 모듈, 지그비(zigbee) 통신 모듈, 블루투스(Bluetooth™) 통신 모듈 등을 포함할 수 있다. Alternatively, the voice recognition device 1 may include other types of communication modules or a plurality of communication modules. For example, the speech recognition device 1 may include an NFC module, a zigbee communication module, a Bluetooth (TM) communication module, and the like.

음성 인식 장치(1)는 와이파이(wi-fi) 통신 모듈 등을 통해 음성 인식 서버 시스템(1100)에 포함되는 서버 또는 외부의 소정 서버, 사용자의 이동 단말기 등과 연결 가능하고, 원격 모니터링, 원격 제어 등 스마트 기능을 지원할 수 있다. The voice recognition apparatus 1 can connect, through a Wi-Fi communication module or the like, to a server included in the voice recognition server system 1100, a predetermined external server, the user's mobile terminal, and so on, and can support smart functions such as remote monitoring and remote control.

사용자는 이동 단말기를 통하여 스마트 홈 시스템(10) 내의 음성 인식 장치(1)에 관한 정보를 확인하거나 음성 인식 장치(1)를 제어할 수 있다. Through the mobile terminal, the user can check information about the voice recognition apparatus 1 in the smart home system 10 or control the voice recognition apparatus 1.

한편, 사용자가 가정 내에서 음성 인식 장치(1)를 제어하거나 소정 정보를 확인하고자 하는 경우에도 이동 단말기를 반드시 이용해야 하는 것은 불편할 수 있다. Meanwhile, it may be inconvenient if the user must always use the mobile terminal, even at home, to control the voice recognition apparatus 1 or check certain information.

예를 들어, 사용자가 이동 단말기의 현재 위치를 모르거나 다른 장소에 있는 경우에 다른 방식으로 음성 인식 장치(1)를 제어할 수 있는 수단이 있는 것이 더 효율적이다. For example, it is more efficient to have a means of controlling the speech recognition device 1 in a different way if the user does not know the current location of the mobile terminal or is in another place.

본 발명의 일 실시예에 따른 음성 인식 장치(1)는 사용자의 음성 입력을 수신할 수 있고, 음성 인식 서버 시스템(1100)은 사용자의 음성 입력을 인식, 분석하여 음성 인식 장치(1)를 제어할 수 있다. The speech recognition apparatus 1 according to an embodiment of the present invention can receive a user's speech input and the speech recognition server system 1100 recognizes and analyzes the speech input of the user to control the speech recognition apparatus 1 can do.

이에 따라, 사용자는 이동 단말기, 원격제어장치를 조작하지 않고서도 음성 인식 장치(1)를 제어할 수 있다. Accordingly, the user can control the voice recognition device 1 without operating the mobile terminal or the remote control device.

한편, 상기 음성 인식 서버 시스템(1100)에 포함되는 서버들 중 적어도 일부는 음성 인식 장치의 제조 회사, 판매 회사가 운영하는 서버이거나 제조 회사 또는 판매 회사가 서비스를 위탁한 회사가 운영하는 서버일 수 있다. Meanwhile, at least some of the servers included in the voice recognition server system 1100 may be servers operated by the manufacturer or seller of the voice recognition apparatus, or servers operated by a company to which the manufacturer or seller has outsourced the service.

도 13a는 본 발명의 일 실시예에 따른 음성 인식 서버 시스템의 일예이다. 13A is an example of a speech recognition server system according to an embodiment of the present invention.

도 13a를 참조하면, 본 발명의 일 실시예에 따른 음성 인식 서버 시스템은, 음성 인식 장치(1)로부터 음성 데이터를 수신하고, 수신한 음성 데이터를 분석하여 음성 명령을 판별하는 음성 서버(1110)를 포함할 수 있다. Referring to FIG. 13A, the voice recognition server system according to an embodiment of the present invention may include a voice server 1110 that receives voice data from the voice recognition apparatus 1 and analyzes the received voice data to determine a voice command.

음성 서버(1110)는, 음성 인식 장치(1)로부터 음성 데이터를 수신하고, 상기 수신한 음성 데이터를 텍스트(text) 데이터로 변환하며, 텍스트 데이터를 분석하여 음성 명령을 판별할 수 있다. The voice server 1110 receives voice data from the voice recognition device 1, converts the received voice data into text data, and analyzes the text data to determine a voice command.

또한, 음성 서버(1110)는, 판별한 음성 명령에 대응하는 신호를 소정 서버로 송신할 수 있다. Further, the voice server 1110 can transmit a signal corresponding to the voice command discriminated to the predetermined server.

예를 들어, 본 발명의 일 실시예에 따른 음성 인식 서버 시스템은 상기 음성 서버(1110)로부터 상기 판별한 음성 명령에 대응하는 신호를 수신하고, 상기 판별한 음성 명령에 대응하는 요청 신호를 생성하는 연계 서비스 서버(1120)와 상기 연계 서비스 서버(1120)로부터 수신되는 요청 신호에 기초하는 제어 신호를 상기 음성 인식 장치(1)로 송신하는 가전 제어 서버(1130)를 포함할 수 있다. For example, the voice recognition server system according to an embodiment of the present invention may include a linked service server 1120 that receives, from the voice server 1110, a signal corresponding to the determined voice command and generates a request signal corresponding to the determined voice command, and a home appliance control server 1130 that transmits, to the voice recognition apparatus 1, a control signal based on the request signal received from the linked service server 1120.

상기 음성 인식 장치(1)는 사용자가 발화한 음성 명령 입력을 수신하여 수신한 음성 명령 입력에 기초한 음성 데이터를 상기 음성 서버(1110)로 송신할 수 있다. The voice recognition apparatus 1 may receive a voice command input uttered by the user and transmit voice data based on the received voice command input to the voice server 1110.

상기 음성 서버(1110)는, 음성 인식 장치(1)로부터 음성 데이터를 수신하고, 수신한 음성 데이터를 텍스트(text) 데이터로 변환하는 자동 음성 인식(Automatic Speech Recognition: ASR) 서버(1111), 상기 자동 음성 인식 서버(1111)로부터 상기 텍스트 데이터를 수신하고, 수신한 텍스트 데이터를 분석하여 음성 명령을 판별하며, 상기 판별한 음성 명령에 기초하는 응답 신호를 상기 음성 인식 장치(1)로 송신하는 자연어 처리(Natural Language Processing: NLP) 서버(1112), 및, 상기 음성 인식 장치(1)로부터 상기 응답 신호에 대응하는 텍스트를 포함하는 신호를 수신하고, 수신한 신호에 포함되는 텍스트를 음성 데이터로 변환하여 상기 음성 인식 장치(1)로 송신하는 텍스트 음성 변환(Text to Speech: TTS) 서버(1113)를 포함할 수 있다. The voice server 1110 may include: an Automatic Speech Recognition (ASR) server 1111 that receives voice data from the voice recognition apparatus 1 and converts the received voice data into text data; a Natural Language Processing (NLP) server 1112 that receives the text data from the automatic speech recognition server 1111, analyzes the received text data to determine a voice command, and transmits a response signal based on the determined voice command to the voice recognition apparatus 1; and a Text to Speech (TTS) server 1113 that receives, from the voice recognition apparatus 1, a signal including text corresponding to the response signal, converts the text included in the received signal into voice data, and transmits the voice data to the voice recognition apparatus 1.

상기 자동 음성 인식 서버(1111)는 음성 인식 장치(1)로부터 수신한 음성 데이터에 대하여 음성 인식을 수행하여 텍스트 데이터를 생성하여 상기 자연어 처리 서버(1112)로 송신할 수 있다. The automatic speech recognition server 1111 may perform speech recognition on the voice data received from the voice recognition apparatus 1 to generate text data, and transmit the text data to the natural language processing server 1112.

상기 자연어 처리 서버(1112)는 상기 자동 음성 인식 서버(1111)로부터 수신한 텍스트 데이터를 자연어 처리 알고리즘에 따라 분석하여 음성 명령을 판별할 수 있다. The natural language processing server 1112 can analyze the text data received from the automatic speech recognition server 1111 according to a natural language processing algorithm to discriminate a voice command.

상기 자연어 처리 서버(1112)는 자연어 처리 알고리즘에 따라 사람이 일상적으로 사용하고 있는 언어인 자연어를 처리할 수 있고, 사용자의 의도(intent)를 분석할 수 있다. 상기 자연어 처리 서버(1112)는 상기 자동 음성 인식 서버(1111)로부터 수신한 텍스트 데이터에 대하여 자연어 처리를 수행하여 사용자의 의도에 부합하는 음성 명령을 판별할 수 있다. The natural language processing server 1112 can process a natural language that is a language that a person routinely uses according to a natural language processing algorithm, and can analyze an intent of the user. The natural language processing server 1112 can perform natural language processing on the text data received from the automatic speech recognition server 1111 to determine a voice command that matches the user's intention.

이에 따라, 상기 자연어 처리 서버(1112)는 사용자가 일상적인 사용 언어로 음성 명령을 입력하더라도 사용자의 의도에 부합하는 음성 명령을 판별할 수 있다. Accordingly, the natural language processing server 1112 can determine a voice command that matches the user's intention even if the user inputs a voice command in a normal use language.
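
As a rough illustration of how the natural language processing server 1112 might map everyday phrasing to a voice command, the following Python sketch uses simple keyword matching. The intent names, the keyword table, and the function name `determine_voice_command` are hypothetical and are not taken from this specification; an actual natural language processing server would rely on a far more capable language model.

```python
from typing import Optional

# Hypothetical keyword table: intent name -> words or phrases that suggest it.
INTENT_KEYWORDS = {
    "PLAY_MUSIC": ("play", "music", "song"),
    "GET_WEATHER": ("weather", "rain", "temperature"),
    "VOLUME_UP": ("louder", "volume up"),
}


def determine_voice_command(text: str) -> Optional[str]:
    """Return the intent whose keywords occur most often in the recognized text.

    Returns None when no keyword matches at all.
    """
    lowered = text.lower()
    best_intent, best_hits = None, 0
    for intent, keywords in INTENT_KEYWORDS.items():
        hits = sum(1 for keyword in keywords if keyword in lowered)
        if hits > best_hits:
            best_intent, best_hits = intent, hits
    return best_intent
```

Under these assumptions, a casual request such as "Could you play some music for me?" and a terse one such as "play song" would both resolve to the same hypothetical `PLAY_MUSIC` command.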

상기 자연어 처리 서버(1112)는 자연어 처리 결과에 대응하는 신호, 즉, 판별한 음성 명령에 대응하는 신호를 상기 연계 서비스 서버(1120)로 송신할 수 있다. The natural language processing server 1112 may transmit a signal corresponding to the natural language processing result, that is, a signal corresponding to the determined voice command, to the linked service server 1120.

상기 연계 서비스 서버(1120)는, 상기 자연어 처리 서버(1112)로부터 상기 판별한 음성 명령에 대응하는 신호를 수신할 수 있다. The linked service server 1120 may receive the signal corresponding to the determined voice command from the natural language processing server 1112.

상기 연계 서비스 서버(1120)는 판별한 음성 명령이 음성 인식 장치(1)에 관한 것이면, 상기 가전 제어 서버(1130)와 통신하여 대응하는 동작을 수행할 수 있다. If the determined voice command concerns the voice recognition apparatus 1, the linked service server 1120 may communicate with the home appliance control server 1130 to perform the corresponding operation.

또는, 상기 연계 서비스 서버(1120)는 판별한 음성 명령이 음성 인식 장치(1)에 관한 것이 아니면, 외부의 외부 서비스(1121)와 통신하여 대응하는 동작을 수행할 수 있다. Alternatively, if the determined voice command does not concern the voice recognition apparatus 1, the linked service server 1120 may communicate with an external service 1121 to perform the corresponding operation.

예를 들어, 상기 연계 서비스 서버(1120)는 판별한 음성 명령이 날씨, 주식, 뉴스 등의 정보를 요청하는 명령이면, 요청된 정보에 대응하는 서비스를 제공하는 서버로 해당 정보를 요청하고 수신할 수 있다. For example, if the determined voice command requests information such as weather, stocks, or news, the linked service server 1120 may request that information from, and receive it from, a server providing the corresponding service.

또한, 상기 연계 서비스 서버(1120)는 수신한 정보를 음성 서버(1110)로 송신할 수 있고, 상기 자연어 처리 서버(1112)는 수신한 정보를 음성 인식 장치(1)로 전달할 수 있다. In addition, the linked service server 1120 may transmit the received information to the voice server 1110, and the natural language processing server 1112 may forward the received information to the voice recognition apparatus 1.

상기 연계 서비스 서버(1120)는 판별한 음성 명령이 음성 인식 장치(1)에 관한 것이면, 상기 판별한 음성 명령에 대응하는 요청 신호를 생성하여 상기 가전 제어 서버(1130)로 송신할 수 있다. If the determined voice command concerns the voice recognition apparatus 1, the linked service server 1120 may generate a request signal corresponding to the determined voice command and transmit it to the home appliance control server 1130.
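
The branching performed by the linked service server 1120 can be sketched as follows. This is a minimal Python illustration only: the set `DEVICE_COMMANDS`, the `RoutedRequest` type, the target labels, and the payload layout are assumptions made for the example and do not appear in the specification.

```python
from dataclasses import dataclass

# Hypothetical set of commands that concern the voice recognition apparatus itself.
DEVICE_COMMANDS = {"PLAY_MUSIC", "VOLUME_UP", "VOLUME_DOWN"}


@dataclass
class RoutedRequest:
    target: str    # "home_appliance_control_server" or "external_service"
    payload: dict  # request signal or information query


def route_voice_command(command: str) -> RoutedRequest:
    """Decide where the linked service server forwards a determined command."""
    if command in DEVICE_COMMANDS:
        # Device-related command: build a request signal for the control server.
        return RoutedRequest("home_appliance_control_server", {"request": command})
    # Anything else (weather, stocks, news, ...) goes to an external service.
    return RoutedRequest("external_service", {"query": command})
```

Keeping the decision in a single function mirrors the description, in which one server inspects the command and selects either the home appliance control server 1130 or an external service.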

상기 가전 제어 서버(1130)는, 상기 연계 서비스 서버(1120)로부터 수신되는 요청 신호에 기초하는 제어 신호를 상기 음성 인식 장치(1)로 송신할 수 있다. The home appliance control server 1130 may transmit, to the voice recognition apparatus 1, a control signal based on the request signal received from the linked service server 1120.

예를 들어, 음성 인식 장치(1)의 음악 재생 요청이 수신되면, 상기 가전 제어 서버(1130)는 상기 음성 인식 장치(1)로 음악 재생을 위한 제어 신호를 송신할 수 있다. For example, when a request for music playback on the voice recognition apparatus 1 is received, the home appliance control server 1130 may transmit a control signal for music playback to the voice recognition apparatus 1.

Meanwhile, the voice recognition apparatus 1 may perform the corresponding operation according to the control signal received from the home appliance control server 1130.

After performing the requested operation, the voice recognition apparatus 1 may transmit a signal to the home appliance control server 1130 indicating that the operation has been performed.

The home appliance control server 1130 may receive, from the voice recognition apparatus 1, a response signal to the control signal and transmit processing result information corresponding to the response signal to the linked service server 1120.

The voice server 1110 may transmit a response signal including the processing result information to the voice recognition apparatus 1.

The voice server 1110 may also receive, from the voice recognition apparatus 1, a signal including output phrase text corresponding to the processing result information, convert the received output phrase text into voice data, and transmit the voice data to the voice recognition apparatus 1.

In this case, the response signal based on the determined voice command, which the natural language processing server 1112 transmits to the voice recognition apparatus 1, may include the processing result information.

Meanwhile, the voice recognition apparatus 1 may receive, from the natural language processing server 1112, a response signal based on the determined voice command. Here, the response signal may include text data of a response corresponding to the determined voice command.

For example, when the user inputs a voice command requesting music playback, the response signal may include text data indicating that the music has been played.

Meanwhile, the voice recognition apparatus 1 may transmit a signal including text corresponding to the received response signal to the text-to-speech conversion server 1113. Here, this signal may include output phrase text corresponding to the processing result information.

The text-to-speech conversion server 1113 may convert the text included in the received signal into voice data and transmit the voice data to the voice recognition apparatus 1. The converted voice data may include a sound source file.

The voice recognition apparatus 1 may output, through a speaker, a voice guidance message based on the received voice data.
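
The round trip described above, in which the voice recognition apparatus 1 receives response text, asks the text-to-speech conversion server 1113 for voice data, and then plays it, can be sketched roughly as follows. This is only an illustration: the class and method names (`ResponseSignal`, `TextToSpeechServer.synthesize`, `on_response`) are hypothetical, and the "synthesis" simply encodes the text so that the example stays self-contained.

```python
from dataclasses import dataclass


@dataclass
class ResponseSignal:
    """Hypothetical response from the natural language processing server (text only)."""
    text: str


class TextToSpeechServer:
    """Hypothetical stand-in for the text-to-speech conversion server 1113."""

    def synthesize(self, text: str) -> bytes:
        # A real server would return a sound source file; here the text is
        # simply encoded as bytes to keep the sketch runnable.
        return text.encode("utf-8")


class VoiceRecognitionApparatus:
    """Hypothetical stand-in for the voice recognition apparatus 1."""

    def __init__(self, tts: TextToSpeechServer) -> None:
        self.tts = tts
        self.played = []  # voice data "output through the speaker"

    def on_response(self, response: ResponseSignal) -> None:
        # Step 1: send the response text to the text-to-speech server.
        audio = self.tts.synthesize(response.text)
        # Step 2: output the returned voice data as a guidance message.
        self.played.append(audio)


device = VoiceRecognitionApparatus(TextToSpeechServer())
device.on_response(ResponseSignal(text="Music playback has started."))
```

Note that in this flow the apparatus itself initiates the text-to-speech request, so the guidance message cannot start until the response text has already reached the apparatus.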

Meanwhile, based on the signal corresponding to the determined voice command, the linked service server 1120 may request state information of the voice recognition apparatus 1 from the home appliance control server 1130, and the home appliance control server 1130 may transmit the state information of the voice recognition apparatus 1 to the linked service server 1120. If the home appliance control server 1130 does not already hold the state information of the voice recognition apparatus 1, it may request and receive the state information from the voice recognition apparatus 1.

If the state information of the voice recognition apparatus 1 indicates that the determined voice command can be supported, the linked service server 1120 may transmit a request signal corresponding to the determined voice command to the home appliance control server 1130.

Alternatively, if the state information of the voice recognition apparatus 1 indicates that the determined voice command cannot be supported, the linked service server 1120 may transmit, to the natural language processing server 1112, a signal indicating that the function is not supported in the current state.

In this case as well, the voice recognition apparatus 1 may request and receive voice data from the text-to-speech conversion server 1113 and output a voice guidance message indicating that the function is not supported in the current state.
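
The state lookup and the supported/unsupported branch described above might be sketched as follows. The representation of state information as a mapping from command name to "supported in the current state" is an assumption made for illustration, as are all names (`ApplianceControlServer`, `get_state`, `route_command`) and the returned strings; the source does not specify any of them.

```python
from typing import Callable, Dict, Optional


class ApplianceControlServer:
    """Hypothetical stand-in for the home appliance control server 1130."""

    def __init__(self, query_device: Callable[[], Dict[str, bool]]) -> None:
        self._query_device = query_device
        self._cached_state: Optional[Dict[str, bool]] = None

    def get_state(self) -> Dict[str, bool]:
        # Ask the apparatus only when no state information is held yet.
        if self._cached_state is None:
            self._cached_state = self._query_device()
        return self._cached_state


def route_command(command: str, control: ApplianceControlServer) -> str:
    """Hypothetical linked-service-server decision based on state information."""
    state = control.get_state()
    if state.get(command, False):
        # Supported: a request signal goes to the appliance control server.
        return "request_sent_to_appliance_control_server"
    # Unsupported: the natural language processing server is notified instead.
    return "unsupported_notice_sent_to_nlp_server"


query_count = []


def query_device() -> Dict[str, bool]:
    # Simulated apparatus reply; the command names are illustrative only.
    query_count.append(1)
    return {"play_music": True, "start_wash": False}


control = ApplianceControlServer(query_device)
first = route_command("play_music", control)
second = route_command("start_wash", control)
```

In this sketch the apparatus is queried once and later lookups reuse the held state; how and when held state is refreshed is left open here, as it is in the text.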

Depending on the embodiment, the voice server 1110 may determine whether the determined voice command can be supported. For example, the natural language processing server 1112, having analyzed the intent of the user's voice command, may determine whether the determined voice command can be supported.

In this case, if the determined voice command includes a command that cannot be supported, the response signal based on the determined voice command, which the natural language processing server 1112 transmits, may be a signal indicating that the determined voice command is a function not supported by the voice recognition apparatus 1.

The voice server 1110 according to an embodiment of the present invention, and the voice recognition server system 1100 including it, may organically connect and use servers that play different roles in natural language speech processing.

The voice recognition apparatus 1 handles reception and preprocessing of the voice command and its transmission to the server, and the voice server 1110 may perform natural language processing such as speech-to-text conversion, intent analysis, and command identification.

Because the voice server 1110 performs the natural language processing, the load on the CPU, memory, and other resources of the embedded module inside the voice recognition apparatus can be reduced.

Meanwhile, the linked service server 1120 may communicate with external services and the home appliance control server 1130 to perform operations based on the user's voice command.

Meanwhile, the voice recognition apparatus 1 may receive voice data including a sound source file from the voice server 1110 and output a voice guidance message as audio, thereby responding to the user's voice input with auditory feedback.

The voice recognition apparatus 1 may receive the voice file from the voice server 1110 by streaming and play back the voice guidance message to the user. Accordingly, the voice recognition apparatus 1 does not need to store a variety of sound source files.

Meanwhile, the linked service server 1120 makes it possible to link with various external services without conflicting with other servers. In addition, through the external service interworking server, external information can be reflected in intent analysis, raising the intent analysis success rate.

The voice recognition server system 1100 according to an embodiment of the present invention secures compatibility and connectivity through the plurality of servers and uses the home appliance control server 1130 for final control commands. This can prevent conflicts between the voice recognition process and home appliance control performed over Wi-Fi communication through the home appliance control server 1130, as well as conflicts between home appliance control through a mobile terminal and home appliance control by voice input through the voice recognition apparatus 1.

Through organic connections between servers, the voice recognition server system 1100 according to an embodiment of the present invention can reduce the concentration of load on a particular server that results from relying on a single server. Because each server has a distinct role, a problem in a particular server can be readily handled by linking with another server that performs the same role.

In addition, the plurality of servers can be updated independently at any time, which is advantageous for improving performance.

FIG. 13B is an example of a voice recognition server system according to an embodiment of the present invention.

The voice recognition server system illustrated in FIG. 13B shortens the voice control response time by improving, relative to the system illustrated in FIG. 13A, the process by which voice data for outputting the voice guidance message is transmitted to the voice recognition apparatus 1.

Accordingly, the voice recognition server systems illustrated in FIGS. 13A and 13B may operate in substantially the same manner apart from this difference, and the parts they have in common are described only briefly below.

Referring to FIG. 13B, a voice recognition server system according to an embodiment of the present invention may include a voice server 1110 that receives voice data from the voice recognition apparatus 1 and analyzes the received voice data to determine a voice command.

The voice recognition server system according to an embodiment of the present invention may further include a linked service server 1120 that receives a signal corresponding to the determined voice command from the voice server 1110 and generates a request signal corresponding to the determined voice command, and a home appliance control server 1130 that transmits, to the voice recognition apparatus 1, a control signal based on the request signal received from the linked service server 1120.

In the voice recognition server system illustrated in FIG. 13B, the voice server 1110 may transmit voice data including processing result information based on the voice command to the voice recognition apparatus 1 even without a request from the voice recognition apparatus 1.

The voice server 1110 may include an automatic speech recognition server 1111 that receives voice data from the voice recognition apparatus 1 and converts the received voice data into text data; a natural language processing server 1112 that receives the text data from the automatic speech recognition server 1111 and analyzes it to determine a voice command; and a text-to-speech conversion server 1113 that converts a response signal based on the voice command into voice data and transmits the voice data to the voice recognition apparatus 1.

In this embodiment as well, the home appliance control server 1130 may receive, from the voice recognition apparatus 1, a response signal to the control signal and transmit processing result information corresponding to the response signal to the linked service server 1120.

The linked service server 1120 may forward the processing result information to the voice server 1110, more specifically to the natural language processing server 1112.

In this case, the voice data that the text-to-speech conversion server 1113 transmits to the voice recognition apparatus 1 may include the processing result information.

The linked service server 1120 may also request state information of the voice recognition apparatus 1 from the home appliance control server 1130 based on the signal corresponding to the determined voice command, and the home appliance control server 1130 may transmit the state information of the voice recognition apparatus 1 to the linked service server 1120.

If the state information of the voice recognition apparatus 1 indicates that the determined voice command can be supported, the linked service server 1120 may transmit a request signal corresponding to the determined voice command to the home appliance control server 1130.

Alternatively, if the state information of the voice recognition apparatus 1 indicates that the determined voice command cannot be supported, the linked service server 1120 may transmit, to the voice server 1110, a signal indicating that the function is not supported in the current state.

For example, the linked service server 1120 may transmit the signal indicating that the function is not supported in the current state to the natural language processing server 1112.

The natural language processing server 1112 may then forward that signal to the text-to-speech conversion server 1113, and the text-to-speech conversion server 1113 may generate the corresponding voice data and transmit it to the voice recognition apparatus 1.

The voice recognition apparatus 1 may receive the voice data from the text-to-speech conversion server 1113 and output a voice guidance message indicating that the function is not supported in the current state.

If the voice recognition apparatus 1 requests the guidance announcement for an operation from the text-to-speech conversion server 1113 only at the very end, after it has performed the operation, a time gap may arise between the operation and the announcement.

According to an embodiment of the present invention, however, when intent analysis is complete and the operation request is transmitted from the natural language processing server 1112, information can be provided to the text-to-speech conversion server 1113 at the same time.

In addition, the text-to-speech conversion server 1113 may provide the guidance announcement to the voice recognition apparatus 1 in step with the time at which the home appliance control server 1130 issues the control command to the voice recognition apparatus 1.

Accordingly, the guidance announcement can be spoken at the same time as, or immediately after, the operation of the voice recognition apparatus 1.

According to this embodiment, directly connecting the natural language processing server 1112 and the text-to-speech conversion server 1113 minimizes the time gap between the control command issued through the home appliance control server 1130 and the guidance announcement.
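
The timing idea of FIG. 13B, where the control path and the guidance path start together once intent analysis finishes, can be illustrated with a small concurrency sketch. Everything here is assumed for illustration: the coroutine names, the simulated delays, and the use of Python's `asyncio` are not taken from the source, which does not describe an implementation.

```python
import asyncio


async def send_control_command(command: str, log: list) -> None:
    """Hypothetical path: linked service server, appliance control server, apparatus."""
    await asyncio.sleep(0.02)  # simulated network and processing delay
    log.append(f"control:{command}")


async def synthesize_and_send_guidance(text: str, log: list) -> None:
    """Hypothetical path: natural language processing server, TTS server, apparatus."""
    await asyncio.sleep(0.02)  # simulated synthesis and transfer delay
    log.append(f"guidance:{text}")


async def handle_intent(command: str, guidance_text: str) -> list:
    log: list = []
    # Both paths are started as soon as intent analysis completes, so the
    # guidance does not wait for the apparatus to finish and then ask for it.
    await asyncio.gather(
        send_control_command(command, log),
        synthesize_and_send_guidance(guidance_text, log),
    )
    return log


result = asyncio.run(handle_intent("play_music", "Playing music."))
```

Because the two coroutines overlap, the total wait is roughly the longer of the two delays rather than their sum, which is the effect the paragraph above attributes to connecting the two servers directly.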

Meanwhile, FIGS. 13A and 13B have been described using, as the voice recognition apparatus 1, a voice recognition apparatus that performs a hub function, but the present invention is not limited thereto. For example, the voice recognition apparatus 1 may be employed, in addition to an air conditioner, in a robot cleaner, a refrigerator, a washing machine, a cooking appliance, a TV, a mobile terminal (smartphone, wearable device), a vehicle, a lighting device, a temperature control device, and the like.

Meanwhile, according to one aspect of the present invention, unlike FIGS. 13A and 13B, the automatic speech recognition server 1111, the natural language processing server 1112, and the text-to-speech conversion server 1113 for voice recognition and processing may be configured as a single integrated server.

Also, depending on the embodiment, the linked service server 1120 and the home appliance control server 1130 may be configured as a single integrated server.

According to the present invention, because the voice recognition apparatus operates in response to voice input, the user does not need to operate a remote control device such as a remote controller, a mobile terminal, or the like, which improves user convenience.

Also, as described with reference to FIGS. 13A and 13B, the present invention uses a plurality of servers to recognize the user's natural language voice command and perform the corresponding control operation, so that natural language can be recognized and processed efficiently without being constrained by the system resources of the voice recognition apparatus or of any individual server.

FIG. 14 is a diagram illustrating an example of an internal block diagram of a voice recognition apparatus according to an embodiment of the present invention.

Referring to FIG. 14, the voice recognition apparatus 1 according to an embodiment of the present invention may include an audio input unit 220 that receives the user's voice command, a memory 250 that stores various data, a communication module 50 that communicates wirelessly with other electronic devices, an output unit 290 that displays predetermined information as an image or outputs it as audio, and a control unit 240 that controls overall operation.

The audio input unit 220 may receive external audio signals and user voice commands. To this end, the audio input unit 220 may include one or more microphones (MIC). To receive the user's voice command more accurately, the audio input unit 220 may include a plurality of microphones 221 and 222. The microphones 221 and 222 may be spaced apart at different positions and may acquire external audio signals and process them into electrical signals.

Although FIG. 14 and other figures show an example in which the audio input unit 220 includes two microphones, a first microphone 221 and a second microphone 222, the present invention is not limited thereto.

The audio input unit 220 may include, or be connected to, a processing unit that converts analog sound into digital data, so that the user's voice command is converted into data that the control unit 240 or a predetermined server can recognize.

Meanwhile, the audio input unit 220 may use various noise removal algorithms to remove noise generated while receiving the user's voice command.

The audio input unit 220 may also include components for audio signal processing, such as a filter that removes noise from the audio signal received by each of the microphones 221 and 222 and an amplifier that amplifies and outputs the signal output from the filter.

The memory 250 records various information necessary for the operation of the voice recognition apparatus 1 and may include a volatile or nonvolatile recording medium. The recording medium stores data readable by a microprocessor and may include a hard disk drive (HDD), solid state disk (SSD), silicon disk drive (SDD), ROM, RAM, CD-ROM, magnetic tape, floppy disk, optical data storage device, and the like.

Meanwhile, data for voice recognition may be stored in the memory 250, and the control unit 240 may process the user's voice input signal received through the audio input unit 220 and perform the voice recognition process.

Meanwhile, simple voice recognition may be performed by the voice recognition apparatus 1, while higher-level voice recognition such as natural language processing may be performed by the voice recognition server system 1100.

For example, when a wake-up voice signal including a preset wake word is received, the voice recognition apparatus 1 may switch to a state for receiving a voice command. In this case, the voice recognition apparatus 1 performs voice recognition only up to determining whether the wake word has been spoken, and recognition of subsequent user voice input may be performed through the voice recognition server system 1100.

Because the system resources of the voice recognition apparatus 1 are limited, complex natural language recognition and processing may be performed through the voice recognition server system 1100.

Limited data may be stored in the memory 250. For example, the memory 250 may store data for recognizing a wake-up voice signal including a preset wake word. In this case, the control unit 240 may recognize the wake-up voice signal including the preset wake word from the user's voice input signal received through the audio input unit 220.

Meanwhile, the wake word may be set by the manufacturer. For example, "LG Hub" may be set as the wake word.

The wake word may also be changed by the user.

The control unit 240 may perform control so that the user's voice command input after recognition of the wake-up voice signal is transmitted to the voice recognition server system 1100 through the communication module 50.
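
The division of labor described above, where the apparatus only spots the wake word locally and sends whatever follows to the server system, might look roughly like the sketch below. It is a simplification under stated assumptions: utterances are represented as already-transcribed strings (a real apparatus would match audio), the class name `WakeWordGate` is hypothetical, and returning to standby after a single command is an assumed policy that the source does not specify.

```python
class WakeWordGate:
    """Hypothetical gate: forwards utterances only after the wake word is heard."""

    def __init__(self, wake_word, send_to_server):
        self.wake_word = wake_word.lower()
        self.send_to_server = send_to_server  # callable standing in for the uplink
        self.awake = False

    def on_utterance(self, transcript):
        if not self.awake:
            # Local recognition is limited to spotting the wake word.
            if self.wake_word in transcript.lower():
                self.awake = True
            return
        # After wake-up, the command goes to the server for recognition.
        self.send_to_server(transcript)
        # Assumed policy: go back to standby after one command.
        self.awake = False


sent = []
gate = WakeWordGate("LG Hub", sent.append)
gate.on_utterance("play some music")  # ignored: no wake word heard yet
gate.on_utterance("LG Hub")           # wake word recognized locally
gate.on_utterance("play some music")  # forwarded to the server
```

Keeping only the wake-word check on the apparatus matches the stated motivation that its memory holds limited data and its system resources are constrained.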

The communication module 50 may perform wireless communication with other electronic devices to exchange various signals. For example, the communication module 50 may communicate with electronic devices inside and outside the smart home system 10.

The communication module 50 may also communicate with the access point device 7 and connect to a wireless Internet network through the access point device 7 to communicate with other devices.

The control unit 240 may also transmit the state information of the voice recognition apparatus 1, the user's voice command, and the like to the voice recognition server system 1100 or elsewhere through the communication module 50.

Meanwhile, when a control signal is received through the communication module 50, the control unit 240 may control the voice recognition apparatus 1 to operate according to the received control signal.

The output unit 290 may include a display 13 that displays, as images, information corresponding to the user's command input, processing results corresponding to the user's command input, the operation mode, the operation state, error states, and the like.

Depending on the embodiment, the display 13 may form a mutual layer structure with a touch pad and be configured as a touch screen. In this case, the display 13 may be used not only as an output device but also as an input device that accepts information through the user's touch.

The output unit 290 may further include an audio output unit 291 that outputs audio signals. Under the control of the control unit 240, the audio output unit 291 may output, as audio, warning sounds; notification messages about the operation mode, operation state, error state, and the like; information corresponding to the user's command input; and processing results corresponding to the user's command input. The audio output unit 291 may convert an electrical signal from the control unit 240 into an audio signal and output it. To this end, it may include a speaker or the like.

Meanwhile, the voice recognition apparatus 1 may further include an operation unit 181 for user input and a camera 210 capable of photographing a predetermined range around the voice recognition apparatus 1.

The operation unit 181 may include a plurality of operation buttons and transmit a signal corresponding to the pressed button to the control unit 240.

The camera 210 photographs the surroundings of the voice recognition apparatus 1, the external environment, and the like, and several such cameras may be installed at different parts of the apparatus for photographing efficiency.

For example, the camera 210 may include at least one optical lens, an image sensor (e.g., a CMOS image sensor) including a plurality of photodiodes (e.g., pixels) on which an image is formed by light passing through the optical lens, and a digital signal processor (DSP) that composes an image based on signals output from the photodiodes. The digital signal processor can generate not only still images but also moving images made up of frames of still images.

Meanwhile, images captured by the camera 210 may be stored in the memory 250.

FIG. 15 is a flowchart illustrating an operation method of a voice recognition apparatus according to an embodiment of the present invention.

Referring to FIG. 15, the voice recognition apparatus 1 may activate the voice recognition function in response to a user input (S1510). In accordance with the user input, the control unit 240 of the voice recognition apparatus 1 may activate the microphones 221 and 222.

For example, the voice recognition function may be activated when one of the contact switches 181a, 181b, 181c, and 181d is operated.

Alternatively, the voice recognition apparatus 1 may activate the voice recognition function automatically, according to a setting for activation of the voice recognition function.

For example, when the power is turned on, the voice recognition apparatus 1 may automatically activate the microphones 221 and 222 and the voice recognition function.

The voice recognition apparatus 1 according to an embodiment of the present invention may provide a user experience (UX) that takes into account the various situations that can arise during the voice recognition process and the home appliance control process.

The control unit 240 may control the display 13 to provide visual information corresponding to each step of the voice recognition process and the home appliance control process.

In addition, the control unit 240 may control the audio output unit 291 to provide auditory information corresponding to each step of the voice recognition process and the home appliance control process.

The display 13 may show the operation mode, current state, and setting items of the voice recognition apparatus 1 as various visual images. It may show them as characters, numbers, and symbols, or as graphic images such as icons.

The display 13 may also show information corresponding to each stage in the processing of a voice input.

Referring to FIG. 15, when the voice recognition function is activated, the display 13 lights up and displays a microphone icon, and the control unit 240 may control the apparatus to enter a wake-up signal standby mode, in which it waits for input of a wake-up voice signal including a preset call word (S1515).

Meanwhile, when the voice recognition function is deactivated, display of the microphone icon may end.

In addition, the audio output unit 291 may output a voice guidance message that prompts the user to speak an utterance including the call word. For example, if the call word is set to "LG Hub", the audio output unit 291 may output by voice a guidance message such as "You can use the voice recognition function by saying LG Hub."

Meanwhile, when a voice input is received from the user while the microphones 221 and 222 of the voice recognition apparatus 1 are activated, the microphones 221 and 222 may receive the input voice and pass it to the control unit 240.

Meanwhile, when a wake-up voice signal including the preset call word is received through the microphones 221 and 222 (S1520), the voice recognition apparatus 1 may switch to a command standby mode for receiving a voice command (S1525). That is, when the wake-up voice signal is received (S1520), the control unit 240 may control the apparatus to enter the command standby mode (S1525).

If the voice recognition apparatus 1 always waited for natural-language commands, this would burden the power consumed by the voice recognition function, the CPU occupancy, and the server load of the voice recognition server system.

Therefore, the voice recognition apparatus 1 may transmit to the server only the voice signals that arrive while it is in the command standby state.

In addition, the voice recognition apparatus 1 may place a condition on the time for which it waits for a command and, when a command is input within that time, receive sound up to the point at which the command is completed and deliver it to the server.

According to one aspect of the present invention, the voice recognition apparatus 1 performs voice recognition only up to determining whether the call word has been spoken, and voice recognition of subsequent user voice input may be performed through the voice recognition server system 1100.

Alternatively, the determination of whether the call word has been input by voice may be performed redundantly, by both the voice recognition apparatus 1 and the voice recognition server system 1100.

Meanwhile, the call word may be set by the manufacturer, and a different call word may be set for each voice recognition apparatus. For example, the call word may be set to "LG Hub". The call word may also be changed by the user.

Meanwhile, the control unit 240 may control the communication module 50 to transmit the user's voice command, input after recognition of the wake-up voice signal, to the voice recognition server system 1100 (S1530).

Meanwhile, waiting indefinitely for a user's voice command wastes system resources and does not match the intention of a user who does not input a command. The command standby mode may therefore be set to process only voice commands input within a predetermined elapsed time (S1527).

In this case, the control unit 240 may control voice data including a voice command received within the preset elapsed time to be transmitted to the voice server 1110 (S1530).

Meanwhile, if the voice command is not received within the preset elapsed time (S1527), the control unit 240 may control the apparatus to switch back to the wake-up signal standby mode.
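
The transitions between the wake-up signal standby mode and the command standby mode can be pictured as a small state machine. The following Python sketch is only an illustration under stated assumptions: the class and method names, the 5-second timeout, and the use of already-transcribed text instead of audio are all hypothetical and are not taken from the specification.

```python
from enum import Enum, auto


class Mode(Enum):
    WAKE_UP_STANDBY = auto()   # waiting for the call word (S1515)
    COMMAND_STANDBY = auto()   # waiting for a voice command (S1525)


class ModeController:
    """Hypothetical controller; the names and the default timeout are illustrative."""

    def __init__(self, call_word="LG Hub", command_timeout_s=5.0):
        self.call_word = call_word
        self.command_timeout_s = command_timeout_s
        self.mode = Mode.WAKE_UP_STANDBY
        self._entered_at = 0.0
        self.sent_to_server = []   # stands in for transmission to the voice server (S1530)

    def on_tick(self, now):
        """Fall back to wake-up standby when the command window has expired (S1527)."""
        if (self.mode is Mode.COMMAND_STANDBY
                and now - self._entered_at > self.command_timeout_s):
            self.mode = Mode.WAKE_UP_STANDBY

    def on_utterance(self, text, now):
        """Handle one utterance at time `now` (seconds); return True if it was forwarded."""
        self.on_tick(now)
        if self.mode is Mode.WAKE_UP_STANDBY:
            if self.call_word.lower() in text.lower():   # call word detected (S1520)
                self.mode = Mode.COMMAND_STANDBY
                self._entered_at = now
            return False   # nothing is sent to the server in this mode
        self.sent_to_server.append(text)
        return True
```

In this sketch, an utterance spoken before the call word is dropped locally, which reflects the stated goal of limiting power use, CPU occupancy, and server load.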

Depending on the embodiment, if the communication module 50 does not receive a response signal based on the voice command from the voice server 1110 within a first time, the audio output unit 291 may output a voice guidance message asking the user to wait. In this case, if the communication module 50 still does not receive a response signal based on the voice command within a second time after the first time, the audio output unit 291 may output a voice guidance message requesting re-input of the voice command.

If the communication module 50 does not receive a response signal based on the voice command even within the second time, the control unit 240 may control the apparatus to switch to the command standby mode.
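
The two-stage handling of a missing response can be sketched as a function of the time elapsed since the voice command was sent. This is a minimal illustration: the function name, the 3-second durations, the message wording, and the string labels for the modes are assumptions, since the specification does not give concrete values.

```python
def guidance_while_waiting(elapsed_s, first_time_s=3.0, second_time_s=3.0):
    """Return (message, next_mode) for the case where no response signal has arrived.

    `first_time_s` and `second_time_s` stand for the first time and the second
    time; their values and the message texts are hypothetical.
    """
    if elapsed_s < first_time_s:
        return (None, "waiting")                        # still within the first time
    if elapsed_s < first_time_s + second_time_s:
        return ("Please wait a moment.", "waiting")     # first time exceeded
    # Second time exceeded as well: ask again and return to command standby.
    return ("Please say the command again.", "command_standby")
```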

The voice recognition apparatus 1 according to an embodiment of the present invention may provide predetermined information to the user not only through operation of the output unit 290 but also in other ways.

For example, when a voice command is received within the preset elapsed time, the voice recognition apparatus 1 according to an embodiment of the present invention may perform a feedback operation corresponding to reception of the command, to inform the user that the command has been received.

Depending on the embodiment, when the voice command is received within the preset elapsed time (S1527), the control unit 240 may control the apparatus to perform the feedback operation corresponding to reception of the command, to inform the user that the voice recognition apparatus 1 has received it.

In this case, a preset operation may be performed under the control of the control unit 240. For example, a predetermined sound may be output.

In addition, when a user inputs a voice command to the voice recognition apparatus 1, the intention in most cases is to operate the apparatus. By outputting the predetermined sound preemptively and then operating according to the voice command once it has been determined, the apparatus can respond to the user's voice command more quickly.

Meanwhile, as described with reference to FIGS. 12 to 14, the voice server 1110 and the voice recognition server system 1100 that includes it may recognize and process voice data, including voice commands, received from the voice recognition apparatus 1.

Accordingly, when the communication module 50 receives a control signal based on the voice command from the home appliance control server 1130 (S1540), the control unit 240 may control the voice recognition apparatus 1 to operate in accordance with the received control signal (S1545).

In addition, when the communication module 50 receives a response signal based on the voice command from the voice server 1110 (S1550), the control unit 240 may control the audio output unit 291 to output a voice guidance message corresponding to the received response signal (S1570).

Depending on the embodiment, if the response signal does not include voice data (S1560), the control unit 240 may request voice data from the voice server 1110 (S1565) and control the apparatus to receive the requested voice data from the voice server 1110.

Meanwhile, depending on the voice recognition result, a signal corresponding to a voice recognition failure may be received from the voice server 1110 through the communication module 50. In this case, the control unit 240 may control the audio output unit 291 to output a voice guidance message requesting re-input of the voice command, and may control the apparatus to switch back to the command standby mode.

In addition, depending on the determination of whether the recognized voice command can be supported, a signal indicating that the voice command relates to an unsupported function may be received from the voice server 1110 through the communication module 50 (S1535). In this case, the control unit 240 may control the audio output unit 291 to output a voice guidance message indicating that the voice command relates to an unsupported function (S1537).

In this case, the control unit 240 may control the apparatus to switch to the wake-up signal standby mode.

Alternatively, the control unit 240 may control the apparatus to switch back to the command standby mode.
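
The handling of the different signals that can come back from the servers can be summarized as a dispatch on the signal type. The sketch below is a hypothetical illustration: the dictionary layout, the `kind` values, and the action strings are invented for the example, and for an unsupported function it picks the wake-up signal standby mode, which is only one of the two options described.

```python
def handle_server_signal(signal):
    """Return the list of actions the control unit would take for one signal.

    `signal` is a dict with a hypothetical "kind" field; none of these field
    names come from the specification.
    """
    kind = signal["kind"]
    if kind == "control":                       # from the home appliance control server (S1540)
        return ["operate:" + signal["action"]]  # operate per the control signal (S1545)
    if kind == "response":                      # from the voice server (S1550)
        actions = []
        if "voice_data" not in signal:          # response lacks voice data (S1560)
            actions.append("request_voice_data")    # ask the voice server for it (S1565)
        actions.append("speak_guidance")        # output the guidance message (S1570)
        return actions
    if kind == "recognition_failed":
        return ["speak:reinput_request", "mode:command_standby"]
    if kind == "unsupported":                   # S1535, then guidance (S1537)
        return ["speak:unsupported_notice", "mode:wake_up_standby"]
    raise ValueError("unknown signal kind: " + kind)
```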

According to the present invention, visual information and/or voice guidance messages suited to each situation are provided, so that the user can identify exactly at which stage, and because of which problem, a voice control command was not carried out normally. In addition, the high degree of freedom in the commands that can control the product improves usability.

FIG. 16 is a flowchart illustrating an operation method of a voice recognition apparatus according to an embodiment of the present invention.

Referring to FIG. 16, the voice recognition apparatus 1 may receive a user's voice input signal through the microphones 221 and 222 (S1610).

In this case, the received voice input signal may include the call word preset for voice processing by the voice recognition apparatus 1, or may include a user's voice command input in the command standby mode after the call word has been recognized.

The audio input unit 220 may include, or be connected to, a processing unit that converts analog voice into digital data, and may convert the user's input voice into data so that it can be recognized by the control unit 240 and the voice recognition server system 1100.

Meanwhile, the audio input unit 220 may include components for audio signal processing, such as a filter that removes noise from the audio signals received by the microphones 221 and 222 and an amplifier that amplifies and outputs the signal output from the filter.

The communication module 50 of the voice recognition apparatus 1 may transmit voice data corresponding to the voice input signal to the voice recognition server system 1100.

Meanwhile, separately from the transmission of the voice data (S1620), the voice recognition apparatus 1 may identify the user based on the frequency and intensity of the voice input signal (S1630).

A person's voice may differ in frequency according to sex and age. The frequency and intensity of the voice may also differ from person to person.

For example, in men the fundamental frequency of the voice drops below its teenage level in the twenties and may then remain stable into the forties. In women, the fundamental frequency may decrease gradually with age from the teens to the thirties and then decrease markedly after the fifties. In addition, infants aged two or under have a high fundamental frequency of 300 to 500 Hz, and young children aged 3 to 14 may have a fundamental frequency of 250 to 300 Hz.

Therefore, the voice recognition apparatus 1 may extract frequency and intensity information from the user's voice and use it for user recognition and as the user's tone color data.
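
The specification does not say how the frequency and intensity are extracted. As one possible illustration, the sketch below estimates the fundamental frequency by autocorrelation and measures intensity as the root-mean-square amplitude. Both choices, the function names, and the 60 to 500 Hz search range are assumptions made for the example.

```python
import math


def rms_intensity(samples):
    """Root-mean-square amplitude, used here as a simple stand-in for intensity."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def fundamental_frequency(samples, sample_rate, f_min=60.0, f_max=500.0):
    """Estimate the fundamental frequency in Hz by autocorrelation.

    The lag with the largest autocorrelation inside the search range is taken
    as the pitch period. The search range is an assumption.
    """
    lag_min = int(sample_rate / f_max)
    lag_max = int(sample_rate / f_min)
    best_lag, best_score = lag_min, float("-inf")
    for lag in range(lag_min, lag_max + 1):
        score = sum(samples[i] * samples[i + lag]
                    for i in range(len(samples) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag


# Usage: a synthetic 200 Hz tone sampled at 8 kHz for 0.1 s.
RATE = 8000
tone = [0.5 * math.sin(2 * math.pi * 200 * n / RATE) for n in range(800)]
f0 = fundamental_frequency(tone, RATE)
level = rms_intensity(tone)
```

Real speech is far less regular than a pure tone, so a practical system would add voicing detection and smoothing on top of an estimator like this.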

To this end, the voice recognition apparatus 1 may collect the user's voice signals in advance, analyze the frequency and intensity of the collected voice signals to determine the user's tone color, and store the voice signals and analysis data in the memory 250 for each user.

That is, a per-user tone color database based on the collected voice signals may be built in the voice recognition apparatus 1 according to an embodiment of the present invention.

The database may store natural-language sound source files included in the voice data.

In addition, the database may store voice data classified according to one or more classification factors.

For example, the database may store voice data classified by individual user, and may also store tone color data obtained by analyzing the frequency and intensity of the voice signals.

The tone color data may include frequency and intensity data of the identified user's voice.
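
One simple way to picture user identification against such a database is a nearest-match lookup on the stored frequency and intensity values. The sketch below is purely illustrative: the database contents, the field names, the scaling constants, and the distance threshold are all made up, and the specification itself also contemplates machine-learning-based recognition rather than a fixed rule like this.

```python
import math

# Hypothetical per-user tone color database: mean fundamental frequency (Hz)
# and mean intensity (dB). The values are invented for illustration.
TONE_COLOR_DB = {
    "user_a": {"f0_hz": 120.0, "intensity_db": 62.0},
    "user_b": {"f0_hz": 210.0, "intensity_db": 58.0},
    "user_c": {"f0_hz": 280.0, "intensity_db": 65.0},
}


def identify_user(f0_hz, intensity_db, db=TONE_COLOR_DB, max_distance=1.0):
    """Return the closest stored user, or None if no entry is close enough."""
    best_user, best_dist = None, float("inf")
    for user, entry in db.items():
        # Scale each feature so that Hz and dB contribute comparably
        # (the scale factors 20 Hz and 6 dB are assumptions).
        dist = math.hypot((f0_hz - entry["f0_hz"]) / 20.0,
                          (intensity_db - entry["intensity_db"]) / 6.0)
        if dist < best_dist:
            best_user, best_dist = user, dist
    return best_user if best_dist <= max_distance else None
```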

In addition, the tone color data may further include at least one of pitch, jitter (rate of vibration change), and shimmer (waveform regularity) data of the identified user's voice.
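
In voice analysis, jitter commonly refers to cycle-to-cycle variation in the pitch period and shimmer to cycle-to-cycle variation in peak amplitude. The sketch below computes the usual "local" versions of both from lists of per-cycle measurements. The specification does not state which definition it uses, so the formulas and function names here are an assumption.

```python
def local_jitter(periods):
    """Mean absolute difference between consecutive pitch periods,
    divided by the mean period (dimensionless)."""
    diffs = [abs(a - b) for a, b in zip(periods, periods[1:])]
    return (sum(diffs) / len(diffs)) / (sum(periods) / len(periods))


def local_shimmer(amplitudes):
    """Mean absolute difference between consecutive cycle peak amplitudes,
    divided by the mean amplitude (dimensionless)."""
    diffs = [abs(a - b) for a, b in zip(amplitudes, amplitudes[1:])]
    return (sum(diffs) / len(diffs)) / (sum(amplitudes) / len(amplitudes))
```

A perfectly periodic signal gives zero for both measures, and natural voices give small positive values.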

The voice recognition apparatus 1 may perform machine learning, such as deep learning, on the received voice input signal, and the memory 250 may store the data used for machine learning, the result data, and the like.

Deep learning, a kind of machine learning, learns from data by descending through multiple stages to a deep level.

Deep learning may refer to a set of machine learning algorithms that extract key data from a plurality of data as the level rises.

A deep learning structure may include an artificial neural network (ANN). For example, it may be composed of a deep neural network (DNN) such as a convolutional neural network (CNN), a recurrent neural network (RNN), or a deep belief network (DBN).

The deep learning structure according to the present invention may use any of various known structures, for example a CNN, an RNN, or a DBN.

A recurrent neural network (RNN) is widely used in natural language processing and the like. It is a structure that is effective for processing time-series data, which changes over time, and can form an artificial neural network structure by stacking up a layer at each time step.

A deep belief network (DBN) is a deep learning structure formed by stacking multiple layers of restricted Boltzmann machines (RBMs), a deep learning technique. RBM learning is repeated, and once a certain number of layers has been reached, a DBN having that number of layers is formed.

A convolutional neural network (CNN) is a model that simulates human brain function, built on the assumption that when a person recognizes an object, the brain extracts the object's basic features, performs complex computations, and recognizes the object based on the result.

Meanwhile, an artificial neural network may be trained by adjusting the weights of the connections between nodes (and, if necessary, the bias values) so that a desired output is produced for a given input. The artificial neural network may continuously update its weight values through learning. Methods such as back propagation may be used to train the artificial neural network.
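
To make the idea of adjusting weights and a bias toward a desired output concrete, the sketch below trains a single sigmoid neuron by gradient descent on the logical OR function. This is a toy example, not the network used by the apparatus: the task, the learning rate, the epoch count, and all names are chosen only for illustration. With a single neuron, back propagation reduces to the one-step gradient shown in the comment.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, bias, inputs):
    return sigmoid(sum(w * x for w, x in zip(weights, inputs)) + bias)


def train_neuron(samples, epochs=5000, lr=1.0):
    """Fit one neuron to (inputs, target) pairs by per-sample gradient descent."""
    weights = [0.0, 0.0]
    bias = 0.0
    for _ in range(epochs):
        for inputs, target in samples:
            out = predict(weights, bias, inputs)
            # Gradient of 0.5 * (out - target)**2 with respect to the
            # neuron's pre-activation: error times the sigmoid derivative.
            delta = (out - target) * out * (1.0 - out)
            weights = [w - lr * delta * x for w, x in zip(weights, inputs)]
            bias -= lr * delta
    return weights, bias


OR_SAMPLES = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]
trained_weights, trained_bias = train_neuron(OR_SAMPLES)
```

After training, rounding `predict(trained_weights, trained_bias, x)` for each input reproduces the OR targets.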

Meanwhile, the voice recognition apparatus 1 may be equipped with an artificial neural network and may perform machine-learning-based user recognition and recognition of the user's tone color, using the received voice input signal as input data.

The control unit 240 may include an artificial neural network, for example a deep neural network (DNN) such as a convolutional neural network (CNN), a recurrent neural network (RNN), or a deep belief network (DBN), and may train the deep neural network.

Both unsupervised learning and supervised learning may be used as the machine learning method of the artificial neural network.

The control unit 240 may control the tone color recognition artificial neural network structure to be updated after learning, according to a setting.

Meanwhile, the voice recognition apparatus 1 may receive a voice input signal uttered by the user and transmit voice data based on the received voice input signal to the voice server 1110, and the voice server 1110 may receive the voice data from the voice recognition apparatus 1 (S1620).

Meanwhile, the voice recognition server system 1100 may analyze the received voice data to recognize the user's voice input.

The voice server 1110 of the voice recognition server system 1100 may convert the voice data into text data. This conversion may be performed by the automatic voice recognition server 1111.

The voice server 1110 may then analyze the text data to recognize the user command contained in the voice data. The natural language processing server 1112 of the voice server 1110 may perform natural language processing on the first text data according to a natural language processing algorithm to determine the user command that matches the user's intention.

The voice recognition server system 1100 may transmit a response signal based on the voice input signal to the voice recognition apparatus 1. Accordingly, the voice recognition apparatus 1 may receive the response signal based on the voice input signal from the voice recognition server system 1100 (S1640).

Meanwhile, the response signal may include text data corresponding to the voice recognition result for the voice input signal, and the control unit 240 may map the text data to the tone color data stored in the database for the identified user.

The voice recognition apparatus 1 may output a voice guidance message corresponding to the received response signal through the audio output unit 291 (S1650).

The control unit 240 may control the voice guidance message to be output in a voice based on the tone color data stored in the database for the identified user.

At present, many voice recognition apparatuses output text-to-speech (TTS) voice guidance messages in a mechanical monotone.

As a result, users may find the machine voice off-putting and may find it hard to feel that they are having a conversation.

However, the voice recognition apparatus 1 may analyze the tone color of the user's voice input through the microphones 221 and 222 in terms of high and low tones (frequency) and intensity, and may reprocess the analyzed data.

Using speaker recognition technology, the voice recognition apparatus 1 may map voice frequency (low and high frequency) and intensity information to previously stored data, such as speech-to-text (STT) data, in the speaker's database.

Depending on the embodiment, the mapping data may be transferred to the voice recognition server system 1100, which can run deep learning and store big data. In this case, the voice recognition apparatus 1 may receive the user's tone color data, or a sound source file to which the user's tone color data has been applied, from the voice recognition server system 1100, and may output a voice guidance message based on it.

By applying previously stored tone color data, the voice recognition apparatus 1 according to an embodiment of the present invention may go beyond a stiff mechanical voice and output a voice guidance message that resembles the user's own manner of speaking and tone color.

That is, the present invention uses the speaker's voice data not only for voice recognition and user recognition but also reuses it for output, and can thereby provide customized voice guidance that is familiar to the user.

In addition, the more the user uses the voice recognition function, the more data accumulates, and with the accumulated data the voice recognition apparatus 1 or the voice recognition server system 1100 can continuously improve performance using deep learning.

The voice recognition apparatus 1 according to an embodiment of the present invention may include two or more speakers 43 and 44 that output different frequency ranges.

For example, the speaker 43 disposed on the upper side may be a tweeter that outputs the high band, and the speaker 44 disposed on the lower side may be a woofer that outputs the low band.

Based on the tone color data, the control unit 240 may divide the output voice into a high band and a low band, analog-encode them, and control the apparatus to output the high-band voice through the high-band speaker 43 and the low-band voice through the low-band speaker 44 of the voice recognition apparatus 1.
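
Splitting one output signal into a low band for the woofer and a high band for the tweeter can be illustrated with a very simple digital crossover. The sketch below uses a one-pole low-pass filter and sends the remainder to the high band. The filter type, the `alpha` coefficient, and the function name are assumptions, since the specification does not describe the band-splitting or analog encoding method.

```python
def split_bands(samples, alpha=0.1):
    """Split a signal into (low_band, high_band) sample lists.

    A one-pole low-pass (exponential smoothing) feeds the woofer path, and the
    residual feeds the tweeter path, so low + high reconstructs the input.
    A smaller `alpha` lowers the crossover frequency.
    """
    low_band, high_band = [], []
    state = 0.0
    for s in samples:
        state += alpha * (s - state)   # low-pass: follows slow changes only
        low_band.append(state)
        high_band.append(s - state)    # complementary high-pass
    return low_band, high_band
```

Because the two bands sum back to the original signal, nothing is lost by routing them to separate speakers in this simplified model.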

Therefore, the voice recognition apparatus 1 according to an embodiment of the present invention can use the user's tone color data (frequency and intensity), accumulated on the basis of deep learning and big data, to produce through the two speakers 43 and 44 a TTS voice that is as close as possible to the user's tone color (intonation and tone of voice).

In addition, the voice recognition apparatus 1 according to an embodiment of the present invention can reflect the user's regional dialect, or provide TTS voice data for an individual user that is differentiated from that of other users.

Accordingly, users can have or share their own TTS voice data. Also, data reflecting the user's tone color may be sent to and used by other devices or other people.

도 17은 본 발명의 일 실시예에 따른 음성 인식 장치의 동작 방법을 도시한 순서도이다.17 is a flowchart illustrating an operation method of a speech recognition apparatus according to an embodiment of the present invention.

도 17을 참조하면, 음성 인식 장치(1)는 마이크(221, 222)를 통하여 사용자의 음성 입력 신호를 수신할 수 있다(S1710).Referring to FIG. 17, the speech recognition apparatus 1 can receive a user's voice input signal through the microphones 221 and 222 (S1710).

또는, 수신되는 음성 입력 신호는, 사용자의 음성 명령과 무관하게 생활 중에 발화한 사용자의 음성을 포함할 수 있다.Alternatively, the received voice input signal may include voice of a user who has uttered a voice in the course of life, regardless of the voice command of the user.

오디오 입력부(220)는 마이크(221, 222)를 통하여 사용자의 음성을 입력받아, 입력된 음성을 데이터화할 수 있다.The audio input unit 220 receives the user's voice through the microphones 221 and 222 and converts the input voice into data.

제어부(240)는, 화자 인식 기술에 따라 음성 데이터에서 주파수와 세기 데이터를 추출하여 화자를 구분할 수 있다(S1720).The control unit 240 may extract the frequency and intensity data from the voice data according to the speaker recognition technique to distinguish the speaker (S1720).

음성 인식 장치(1)는 상기 음성 입력 신호의 주파수 및 세기에 기초하여 사용자를 식별할 수 있고(S1720), 상기 식별된 사용자에 대응하여 데이터베이스에 저장된 음색 데이터에 기초하는 음성 안내 메시지를 출력할 수 있다(S1730).The speech recognition apparatus 1 can identify the user based on the frequency and the intensity of the speech input signal (S1720), and can output a speech announcement message based on the tone color data stored in the database corresponding to the identified user (S1730).

For example, when speech from a user who has just come home is received, the speech recognition apparatus 1 may utter a greeting in a familiar voice that reflects the user's tone.

The speech recognition apparatus 1 may also recommend actions such as playing music or controlling an air conditioner.

In this embodiment as well, the control unit 240 may, based on the tone data, divide the output voice into a high-frequency band and a low-frequency band and analog-encode it, and may control the high-frequency band voice to be output through the high-frequency band speaker 43 of the speech recognition apparatus 1 and the low-frequency band voice to be output through the low-frequency band speaker 44.

Depending on the embodiment, the speech recognition apparatus 1 may perform the speech recognition process by itself.

In this case, the control unit 240 may determine a voice command included in the voice input signal, map text data corresponding to the determined voice command to the tone data stored, in correspondence with the identified user, in the memory 250 or in the database of the voice recognition server system 1100, and then control the apparatus to utter a voice guidance message to which the user's tone is applied.

The control unit 240 may map the voice data to the tone data stored in the database, fetch from the database the sounds for the characters needed for the voice response, and perform analog (ANALOG) encoding of the high and low tones in the fetched data.

The control unit 240 can provide a REAL TTS voice service by controlling the high tones to be reproduced through the tweeter speaker 43 and the low tones to be reproduced through the woofer speaker 44.
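The routing of high tones to the tweeter and low tones to the woofer can be sketched as a simple two-way crossover. The text describes analog encoding; the sketch below swaps in a digital one-pole low-pass filter as a stand-in, and the function name `split_bands` and the smoothing factor `alpha` are assumptions made only for illustration.

```python
def split_bands(samples, alpha=0.1):
    """Split a sample sequence into (low_band, high_band).

    low_band is a one-pole low-pass of the input (a candidate woofer feed);
    high_band is the residual (a candidate tweeter feed). The two bands sum
    back to the original signal sample by sample.
    """
    low_band, high_band = [], []
    state = 0.0
    for x in samples:
        # Exponential smoothing: the state follows slow changes only.
        state += alpha * (x - state)
        low_band.append(state)
        high_band.append(x - state)
    return low_band, high_band
```

A slowly varying input ends up almost entirely in the low band, while a rapidly alternating input ends up almost entirely in the high band. A real crossover would choose the cutoff from the speaker characteristics.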

FIG. 18 is a flowchart illustrating an operation method of a speech recognition apparatus according to an embodiment of the present invention. FIG. 18 includes a process of collecting user voices, in parallel with and independently of speech recognition, and building them into a database.

Referring to FIG. 18, the speech recognition apparatus 1 may enter a voice collection mode according to the user's setting or consent (S1810). Because the voice collection mode collects voice data and converts it into text data, or generates text data from an analysis of the frequency, intensity, and other properties of the collected voice, it may also be called an STT (Speech To Text) data collection mode or an Only STT DATA collection mode.

This voice collection mode can run separately from the speech recognition process, which allows a voice database to be built quickly.

Meanwhile, to collect voice independently of the speech recognition process, the microphones 221 and 222 may be activated regardless of the speech recognition process (S1815).

That is, the microphones 221 and 222 may operate in a speech recognition standby state (mic always on state) even when the speech recognition apparatus 1 has not been called by voice input and no predetermined button has been operated.

The speech recognition apparatus 1 may collect voice data using the always-on microphones 221 and 222. Accordingly, the speech recognition apparatus 1 can listen to and collect all conversations of family members in the space where it is installed.

Meanwhile, the speech recognition apparatus 1 may perform speaker recognition software processing on the collected voice (S1820), distinguish the individual voices (S1825), and store them in the database (S1840).

In this case, the speech recognition apparatus 1 may map the frequency and intensity data of the user's voice to the previously collected and analyzed STT data (S1830) and store the result in the database (S1840).

Meanwhile, the database may store only data that the user has spoken frequently, or data weighted according to how often the user has used each vocabulary item.
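One way to picture the frequency-based storage is the sketch below. It assumes that the collected STT data is available as plain text utterances and that words can be separated on whitespace, which is a simplification (Korean text in particular would need morphological analysis); `build_vocabulary` and `min_count` are hypothetical names.

```python
from collections import Counter


def build_vocabulary(utterances, min_count=2):
    """Keep only frequently spoken words and weight them by relative frequency.

    Words seen fewer than min_count times are dropped ("only data that the
    user has spoken frequently"); the rest receive a weight equal to their
    share of the retained word occurrences.
    """
    counts = Counter(
        word for text in utterances for word in text.lower().split()
    )
    kept = {word: n for word, n in counts.items() if n >= min_count}
    total = sum(kept.values())
    if total == 0:
        return {}
    return {word: n / total for word, n in kept.items()}
```

For the utterances "baseball game tonight", "baseball score", and "weather tonight", only "baseball" and "tonight" occur twice, so each receives a weight of 0.5 and the other words are not stored.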

In addition, for words with a high frequency, data related to those words may be received from an external network and additionally stored in the database.

In particular, if a word that the user uses with the speech recognition function is also one the user uses frequently in daily life, so that the database already holds a large amount of information about it, the apparatus can provide voice guidance specialized for the user and a voice guidance service that reflects the user's preferences.

For example, if the database contains many utterances about sports, the speech recognition apparatus 1 can develop into a speech recognition apparatus 1 specialized in sports.

For example, if the user frequently utters words related to 'baseball', the apparatus may receive the 'baseball'-related data that the user often mentions from the voice recognition server system 1100 or other servers (S1850), and may use the additionally obtained data to provide the user with detailed information related to 'baseball' (S1860).
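The decision of which topics deserve additional data (the trigger for S1850) might look like the following sketch. The keyword sets, the threshold, and the function name `topics_to_fetch` are all hypothetical; the text only states that frequently mentioned subjects such as 'baseball' lead to additional data being received.

```python
# Hypothetical mapping from topics to related keywords (illustrative only).
TOPIC_KEYWORDS = {
    "baseball": {"baseball", "pitcher", "inning", "homerun"},
    "weather": {"weather", "rain", "forecast"},
}


def topics_to_fetch(word_counts, topic_keywords=TOPIC_KEYWORDS, threshold=5):
    """Return the topics whose keywords were uttered at least threshold times.

    word_counts maps each spoken word to the number of times it was heard.
    The result is sorted so that the output order is deterministic.
    """
    totals = {
        topic: sum(word_counts.get(word, 0) for word in keywords)
        for topic, keywords in topic_keywords.items()
    }
    return sorted(topic for topic, n in totals.items() if n >= threshold)
```

With counts of 4 for "baseball", 2 for "inning", and 1 for "rain", only the baseball topic reaches the threshold, so only baseball-related data would be requested from the server.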

In this case, the control unit 240 can provide the REAL TTS voice service by encoding the voice for the information to be provided so that the high tones go to the tweeter speaker 43 and the low tones go to the woofer speaker 44 (S1870), and by controlling the voice to be output (S1880).

Meanwhile, the speech recognition apparatus 1 or the voice recognition server system 1100 can continuously improve its performance by using idle time to perform deep learning.

FIG. 19 is a diagram referred to in describing an operation method of a speech recognition apparatus according to an embodiment.

The database according to an embodiment of the present invention may be stored in a storage medium of the speech recognition apparatus 1 or of the voice recognition server system 1100.

Referring to FIG. 19, the database may store data collected from the members of the living space in which the speech recognition apparatus 1 is placed and used. For example, the database may store data for family members A, B, C, and D.

The speech recognition apparatus 1 or another electronic device may load and use the data of whichever of the persons A, B, C, and D stored in the database is needed.

For example, the speech recognition apparatus 1 may perform speaker recognition on the received voice and recognize that A and C are currently in the predetermined space.

In this case, the speech recognition apparatus 1 may load the data for A and C from the database.
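Loading only the data of the members who are present can be sketched as a keyed lookup. The record layout (`MemberProfile` with `tone` and `frequent_words` fields) and the function `load_present_members` are assumptions for illustration; the text does not define the database schema.

```python
from dataclasses import dataclass, field


@dataclass
class MemberProfile:
    """Hypothetical per-member record; the field names are illustrative."""
    name: str
    tone: dict = field(default_factory=dict)            # e.g. frequency, intensity
    frequent_words: list = field(default_factory=list)  # frequently spoken words


# Database holding family members A, B, C, and D, as in the FIG. 19 example.
DATABASE = {name: MemberProfile(name) for name in "ABCD"}


def load_present_members(recognized_speakers, database=DATABASE):
    """Return the profiles of recognized speakers who exist in the database.

    Speakers who are not enrolled (for example, a guest) are skipped.
    """
    return {
        name: database[name]
        for name in recognized_speakers
        if name in database
    }
```

When speaker recognition reports A and C, only those two profiles are loaded, and the profile of whoever then issues a voice command supplies the tone for the response.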

Thereafter, when person C inputs the voice command "Tell me today's weather", the speech recognition apparatus 1 may perform a natural-language speech recognition process by itself, or may transmit the voice data to the voice recognition server system 1100 and receive the corresponding speech recognition result.

After recognizing the voice command or receiving the recognition result, the speech recognition apparatus 1 may output, in response to the speech recognition result, a voice guidance message to which a tone similar to person C's voice and manner of speaking is applied.

In this embodiment, the speech recognition apparatus 1 may announce today's weather in a tone similar to person C's voice and manner of speaking.

Other electronic devices such as an air conditioner, a washing machine, a refrigerator, and a mobile terminal may also use the database.

For example, a refrigerator may perform speaker recognition on the received voice and recognize that B and D are currently in the predetermined space.

In this case, the refrigerator may load the data for B and D from the database.

Thereafter, when person B inputs the voice command "Tell me today's schedule", the refrigerator may perform a natural-language speech recognition process by itself, or may transmit the voice data to the voice recognition server system 1100 and receive the corresponding speech recognition result.

After recognizing the voice command or receiving the recognition result, the refrigerator may output, in response to the speech recognition result, a voice guidance message to which a tone similar to person B's voice and manner of speaking is applied.

In this embodiment, the refrigerator may announce today's schedule in a tone similar to person B's voice and manner of speaking.

The voice server, the voice recognition server system, and the speech recognition apparatus according to the present invention are not limited to the configurations and methods of the embodiments described above; all or part of the embodiments may be selectively combined so that various modifications can be made.

Meanwhile, the speech recognition method and the operation methods of the voice server, the voice recognition server system, and the speech recognition apparatus according to embodiments of the present invention can be implemented as processor-readable code on a processor-readable recording medium. The processor-readable recording medium includes all kinds of recording devices in which processor-readable data is stored. Examples include ROM, RAM, CD-ROM, magnetic tape, floppy disks, and optical data storage devices, and also include implementation in the form of a carrier wave, such as transmission over the Internet. In addition, the processor-readable recording medium may be distributed over network-connected computer systems so that processor-readable code is stored and executed in a distributed manner.

Although preferred embodiments of the present invention have been shown and described above, the present invention is not limited to the specific embodiments described. Various modifications can of course be made by those of ordinary skill in the art to which the invention pertains without departing from the gist of the invention as claimed in the claims, and such modifications should not be understood separately from the technical idea or outlook of the present invention.

Speech recognition apparatus: 1
Voice recognition server system: 1100
Voice server: 1110
ASR server: 1111
NLP server: 1112
TTS server: 1113
Linked service server: 1120
Home appliance control server: 1130

Claims

1. A method for operating a voice recognition apparatus, the method comprising:
receiving a voice input signal of a user through a microphone;
transmitting voice data corresponding to the voice input signal to a voice recognition server system;
identifying the user based on a frequency and an intensity of the voice input signal;
receiving a response signal based on the voice input signal from the voice recognition server system; and
outputting a voice guidance message corresponding to the received response signal,
wherein the outputting of the voice guidance message outputs the voice guidance message in a voice based on tone data stored in a database in correspondence with the identified user.

2. The method according to claim 1,
wherein, in the outputting of the voice guidance message,
the output voice is divided into a high-frequency band and a low-frequency band and analog-encoded based on the tone data, a high-frequency band sound is output through a high-frequency band speaker provided in the voice recognition apparatus, and a low-frequency band sound is output through a low-frequency band speaker.

3. The method according to claim 1,
wherein the tone data includes frequency and intensity data of the identified user's voice.

4. The method according to claim 3,
wherein the tone data further includes at least one of pitch, jitter, and shimmer data.

5. The method according to claim 1,
wherein the response signal includes text data corresponding to a speech recognition result for the voice input signal.

6. The method according to claim 5,
further comprising mapping the text data to the tone data stored in the database in correspondence with the identified user.

7. A method for operating a voice recognition apparatus, the method comprising:
receiving a voice input signal of a user through a microphone;
identifying the user based on a frequency and an intensity of the voice input signal; and
outputting a voice guidance message based on tone data stored in a database in correspondence with the identified user.

8. The method according to claim 7,
wherein, in the outputting of the voice guidance message,
the output voice is divided into a high-frequency band and a low-frequency band and analog-encoded based on the tone data, a high-frequency band sound is output through a high-frequency band speaker provided in the voice recognition apparatus, and a low-frequency band sound is output through a low-frequency band speaker.

9. The method according to claim 7,
wherein the tone data includes frequency and intensity data of the identified user's voice.

10. The method according to claim 7,
further comprising determining a voice command included in the voice input signal.

11. The method according to claim 10,
further comprising mapping text data corresponding to the determined voice command to the tone data stored in the database in correspondence with the identified user.