KR20190036832A

KR20190036832A - Voice Receiving Terminal and Internet Of Things Network System And Method Including The Same

Info

Publication number: KR20190036832A
Application number: KR1020170126215A
Authority: KR
Inventors: 조영빈
Original assignee: 주식회사 케이티
Priority date: 2017-09-28
Filing date: 2017-09-28
Publication date: 2019-04-05

Abstract

The present invention provides a voice receiving terminal for increasing the voice recognition rate of a voice recognition control server. The voice receiving terminal comprises: a sound source receiving unit including a plurality of microphones that receive sound source signals; a beamforming unit that sequentially forms beams for emphasizing the sound source signals in each of a plurality of directions; a control unit that compares the magnitudes of the sound source signals emphasized by the beamforming unit, fixes the beam in the sound source direction, among the plurality of directions, from which the largest sound source signal is output, and determines whether the sound source signal output from the sound source direction is a voice command signal of a user; and a signal converting unit that converts the voice command signal into an electrical signal when the sound source signal output from the sound source direction is determined to be the voice command signal.

Description

The present invention relates to a voice receiving terminal, and to an Internet of Things network system and method including the same.

The Internet has served as a space where people share information as its producers and consumers. An Internet of Things (IoT) era is expected to arrive in which even the objects around us, such as home appliances and sensors, are connected to the network, so that environmental information around those objects and information about the objects themselves can also be shared. In other words, the number of devices supporting the Internet of Things is expected to grow rapidly.

Once the Internet of Things enables communication, interaction, and information sharing between people, between people and things, and between things, intelligent services in which things make their own decisions become possible, and companies can reduce costs and, further, build infrastructure that supports a green Internet of Things for green growth.

Recently, Internet of Things network methods in which a voice recognition device recognizes a speaker's voice to control Internet of Things devices have been actively studied.

Such a voice recognition device generally uses a microphone as the means for acquiring voice.

However, the environment in which a microphone receives a sound source signal is more often one that contains various noises and ambient interfering sounds than a quiet one free of interference.

To capture only the speaker's voice in such an environment, either a microphone with good directivity must be used or the distance between the microphone and the speaker must be kept short.

However, a microphone with good directivity raises the cost, and when a plurality of Internet of Things devices are placed in spatially separate locations it is difficult to keep the microphone close to the speaker. Controlling a plurality of Internet of Things devices placed in separate spaces through a single voice recognition device is therefore subject to constraints.

As the distance between the microphone and the speaker increases, ambient noise and reverberation are more likely to enter the microphone along with the speaker's voice, resulting in a low signal-to-noise ratio (SNR). That is, the voice recognition rate of the voice recognition device decreases.

An object of the present invention is to improve the voice recognition rate of a voice recognition control server by using a voice receiving terminal to deliver a user's voice command to a voice recognition control server located in a space different from that of the voice receiving terminal and the user.

Another object of the present invention is to reduce power consumption and production cost by not providing the voice receiving terminal with a separate component for interpreting the voice command signal, and thereby to supply the voice receiving terminal to consumers (users) at a low price.

To solve the problems described above, the present invention provides a voice receiving terminal comprising: a sound source receiving unit including a plurality of microphones that receive a sound source signal; a beamforming unit that sequentially forms a beam for emphasizing the sound source signal in each of a plurality of directions; a control unit that compares the magnitudes of the plurality of sound source signals emphasized by the beamforming unit, fixes the beam in the sound source direction, among the plurality of directions, from which the largest sound source signal is output, and determines whether the sound source signal output from the sound source direction is a voice command signal of a user; and a signal converting unit that converts the voice command signal into an electrical signal when the sound source signal output from the sound source direction is determined to be a voice command signal.

Here, the sound source receiving unit may perform a sensing operation for the sound source signal through any one of the plurality of microphones and, when a sound source signal is sensed, operate the plurality of microphones to receive the sound source signal.

The plurality of directions may include first to fourth directions, and the beams may be formed at regular intervals.

The sound source receiving unit may receive the sound source signal when the sound source signal is equal to or greater than a reference value.

The voice receiving terminal may further include an output unit that outputs the electrical signal.

The voice receiving terminal may further include a power connection unit consisting of a plug that couples to an outlet supplying commercial power, or of the plug and a USB socket coupled to the plug.

The present invention also provides an Internet of Things network system comprising: a voice receiving terminal including a sound source receiving unit having a plurality of microphones that receive a sound source signal, a beamforming unit that sequentially forms a beam for emphasizing the sound source signal in each of a plurality of directions, a first control unit that compares the magnitudes of the plurality of sound source signals emphasized by the beamforming unit, fixes the beam in the sound source direction, among the plurality of directions, from which the largest sound source signal is output, and determines whether the sound source signal output from the sound source direction is a voice command signal of a user, a signal converting unit that converts the voice command signal into an electrical signal when the sound source signal output from the sound source direction is determined to be a voice command signal, and an output unit that outputs the electrical signal; and a voice recognition control server including a voice recognition unit that receives the electrical signal and generates voice control information corresponding to the electrical signal, and a second control unit that controls an Internet of Things device according to the voice control information.

Here, the voice recognition control server may be placed in a first space, and the voice receiving terminal may be placed in a second space different from the first space.

The voice recognition control server may control an Internet of Things device located in the second space.

The sound source receiving unit may determine, based on the sound source signal, whether the user is present in the second space.

The first control unit may determine whether the user is entering or leaving the second space from a change in the sound source direction or from the amount of change in the magnitude of the sound source signal in the sound source direction.

The second control unit may control the Internet of Things device according to whether the user is entering or leaving the second space.

The voice receiving terminal may be connected to the voice recognition control server over a network by being registered with the voice recognition control server.

A voice receiving terminal may be placed in each of a plurality of second spaces, and the voice recognition unit may determine the user's location based on the levels of the electrical signals provided by the respective voice receiving terminals.

The voice recognition control server may also receive the voice command signal itself and convert it into an electrical signal, and the voice recognition unit may determine the user's location based on the levels of the electrical signals provided by the plurality of voice receiving terminals and of the electrical signal converted by the voice recognition control server.

The voice recognition unit may generate voice control information corresponding to the electrical signal having the highest level among the plurality of electrical signals.
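
The level-based selection described in the preceding paragraphs can be sketched as follows. This is a minimal illustration in Python, not the claimed implementation: the function name `select_strongest_source`, the source identifiers, and the numeric levels are hypothetical, and the sketch assumes each electrical signal has already been reduced to a single level value.

```python
def select_strongest_source(levels):
    """Return the id of the source whose electrical signal level is largest.

    `levels` maps a source id (hypothetical names) to a signal level.
    The server's own microphone can be included as one more entry.
    """
    if not levels:
        raise ValueError("no electrical signal was provided")
    return max(levels, key=levels.get)


# Hypothetical levels reported for one utterance.
levels = {"terminal_room_1": 0.12, "terminal_room_2": 0.47, "server": 0.05}
nearest = select_strongest_source(levels)
print(nearest)
```

Under these assumptions the user would be taken to be in the room whose terminal reported the highest level, and the voice control information would be generated from that terminal's signal.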

The present invention also provides an Internet of Things network method comprising: performing a sensing operation for a sound source signal through any one of a plurality of microphones; operating the plurality of microphones when a sound source signal is sensed; receiving the sound source signal through the plurality of microphones; sequentially forming a beam for emphasizing the sound source signal in each of a plurality of directions; comparing the magnitudes of the plurality of sound source signals and fixing the beam in the sound source direction, among the plurality of directions, from which the largest sound source signal is output; determining whether the sound source signal output from the sound source direction is a voice command signal of a user; and converting the voice command signal into an electrical signal when the sound source signal output from the sound source direction is determined to be a voice command signal.

Here, the method may further include returning to the step of performing the sensing operation for the sound source signal when the sound source signal is not determined to be a voice command signal.

The method may further include receiving the electrical signal and generating voice control information corresponding to the electrical signal, and controlling an Internet of Things device according to the voice control information.
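
The terminal-side steps of the method can be read as a single pass that either yields an electrical signal or falls back to sensing. The sketch below is only an illustration under stated assumptions: `run_terminal_pass` and its three callables are hypothetical stand-ins, and the source does not specify how a voice command is recognized as such.

```python
def run_terminal_pass(sensed_level, reference_level, scan, is_voice_command, convert):
    """One pass through the terminal-side steps of the method.

    The callables are hypothetical stand-ins:
      scan() returns {direction: emphasized magnitude},
      is_voice_command(direction) returns True or False,
      convert(direction) returns the electrical signal to output.
    Returns the electrical signal, or None to go back to sensing.
    """
    if sensed_level < reference_level:
        return None  # nothing sensed: keep sensing with one microphone
    levels = scan()  # all microphones on, one beam per direction
    direction = max(levels, key=levels.get)  # fix the beam on the largest signal
    if not is_voice_command(direction):
        return None  # not a voice command: return to the sensing step
    return convert(direction)


signal = run_terminal_pass(
    0.4, 0.1,
    scan=lambda: {"D1": 0.1, "D3": 0.6},
    is_voice_command=lambda d: True,
    convert=lambda d: f"signal from {d}",
)
print(signal)
```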

According to the present invention, the voice recognition rate of the voice recognition control server can be improved by using a voice receiving terminal to deliver a user's voice command to a voice recognition control server located in a space different from that of the voice receiving terminal and the user.

In addition, according to the present invention, because the voice receiving terminal has no separate component for interpreting the voice command signal, power consumption and production cost can be reduced, and the voice receiving terminal can therefore be supplied to consumers (users) at a low price.

The effects obtainable from the present invention are not limited to those mentioned above, and other effects not mentioned will be clearly understood from the following description by those of ordinary skill in the art to which the present invention pertains.

FIG. 1 is a schematic block diagram of an Internet of Things network system according to an embodiment of the present invention.
FIG. 2 is a detailed block diagram of a voice receiving terminal according to an embodiment of the present invention.
FIG. 3 is a graph illustrating a beam pattern formed by the sound source receiving unit of FIG. 2.
FIG. 4 is a diagram illustrating beams sequentially formed in a plurality of directions by the sound source receiving unit of FIG. 2.
FIG. 5 is an exemplary perspective view of a voice receiving terminal according to an embodiment of the present invention.
FIG. 6 is a detailed block diagram of a voice recognition control server according to an embodiment of the present invention.
FIG. 7 is a flowchart of an Internet of Things network method according to an embodiment of the present invention.

Hereinafter, preferred embodiments of the present invention will be described in detail with reference to the accompanying drawings. The detailed description set forth below together with the accompanying drawings is intended to describe exemplary embodiments of the present invention and is not intended to represent the only embodiments in which the present invention may be practiced. To describe the present invention clearly, parts of the drawings unrelated to the description may be omitted, and the same reference numerals may be used for the same or similar components throughout the specification.

In an embodiment of the present invention, expressions such as "or" and "at least one" may denote one of the words listed together, or a combination of two or more of them. For example, "A or B" and "at least one of A and B" may include only one of A and B, or may include both A and B.

The Internet of Things (IoT) refers to a network connecting things and spaces in which three distributed environmental elements, namely people, things, and services, cooperatively form intelligent relationships such as sensing, networking, and information processing.

Things, the main constituent of the Internet of Things, include not only communication equipment in wired and wireless networks but also people, vehicles, bridges, various electronic equipment, cultural assets, and the physical objects that make up the natural environment.

The Internet of Things also extends to the Internet the concept of M2M (Machine to Machine), which enables intelligent communication over a network between people and things and between things, and interacts not only with things but with all information in the real and virtual worlds.

FIG. 1 is a schematic block diagram of an Internet of Things network system according to an embodiment of the present invention.

As shown in FIG. 1, the Internet of Things network system according to an embodiment of the present invention may include a voice receiving terminal 100, a voice recognition control server 200, and an Internet of Things device 300.

Here, the voice receiving terminal 100, the voice recognition control server 200, and the Internet of Things device 300 are connected by a wired or wireless communication network.

The voice receiving terminal 100 may be connected to the voice recognition control server 200 over the network by being registered with the voice recognition control server 200.

For example, the user may press a pairing button provided on the voice receiving terminal 100 to pair it with the voice recognition control server 200, or an authentication application may be run on the voice recognition control server 200 to register the voice receiving terminal 100.

The Internet of Things device 300 includes, but is not limited to, a refrigerator, a microwave oven, a gas range, an electric rice cooker, a washing machine, an air conditioner, and a television (TV).

The network refers to a connection structure that allows information to be exchanged between the voice receiving terminal 100 and the voice recognition control server 200, or between the Internet of Things device 300 and the voice recognition control server 200. Examples of such a network include, but are not limited to, Wi-Fi, Bluetooth, the Internet, LAN (Local Area Network), Wireless LAN (Wireless Local Area Network), WAN (Wide Area Network), PAN (Personal Area Network), 3G, 4G, 5G, and LTE.

The voice recognition control server 200 is placed in a first space A1, and the voice receiving terminal 100 is placed in a second space A2 different from the first space A1. For example, the first space A1 may be a living room at the center of a home, and the second space A2 may be a room adjoining the living room; there may be a plurality of such rooms. In that case, a plurality of voice receiving terminals 100 may be provided, one in each room.

When the user, within the second space A2, issues a voice command for controlling the Internet of Things device 300, the voice receiving terminal 100 receives the user's voice command, and the voice recognition control server 200 receives the user's voice command from the voice receiving terminal 100 and controls the Internet of Things device 300.

Meanwhile, when the home is spatially divided into the first space A1 and the second space A2, and the user in the second space A2 issues a voice command directly to the voice recognition control server 200 located in the first space A1 in order to control an Internet of Things device 300 located in the second space A2, the voice recognition rate of the voice recognition control server 200 inevitably drops significantly.

The Internet of Things network system according to an embodiment of the present invention therefore uses the voice receiving terminal 100 placed in the second space A2 to deliver the user's voice command to the voice recognition control server 200 placed in the first space A1, thereby improving the voice recognition rate of the voice recognition control server 200.

Naturally, when the user is in the first space A1 and issues a command to control an Internet of Things device 300 located in the first space A1, the voice recognition control server 200 can receive the user's voice command directly and control the Internet of Things device 300.

FIG. 2 is a detailed block diagram of a voice receiving terminal according to an embodiment of the present invention, FIG. 3 is a graph illustrating a beam pattern formed by the sound source receiving unit of FIG. 2, and FIG. 4 is a diagram illustrating beams sequentially formed in a plurality of directions by the sound source receiving unit of FIG. 2.

As shown in FIG. 2, the voice receiving terminal 100 according to an embodiment of the present invention may include a sound source receiving unit 110, a beamforming unit 120, a first control unit 130, a signal converting unit 140, and an output unit 150.

As shown in FIG. 4, the sound source receiving unit 110 may include a plurality of microphones M1 to M4 that receive a sound source signal. The microphones M1 to M4 may be arranged side by side at regular intervals, horizontally or vertically, facing the sound source.

Although only four microphones M1 to M4 are shown in the drawing, the invention is not limited to this, and the number of microphones may be increased or decreased.

The sound source receiving unit 110 may determine, based on the sound source signal, whether the user is present in the second space A2.

Specifically, the sound source receiving unit 110 performs a sensing operation for the sound source signal through any one of the microphones M1 to M4 and, when a sound source signal is sensed, determines that the user is present in the second space A2 and operates all of the microphones M1 to M4 to receive the sound source signal.

In other words, while the sound source signal is below a reference value, the sound source receiving unit 110 operates only one of the microphones M1 to M4 to perform the sensing operation. In this state, the beamforming unit 120 does not operate, because there is taken to be no sound source signal to emphasize, that is, the user is taken to be outside the second space A2.

When the sound source signal is equal to or greater than the reference value, the sound source receiving unit 110 determines that the user is present in the second space A2 and operates all of the microphones M1 to M4.
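
The reference-value gating described above can be sketched as follows. This is a minimal illustration under stated assumptions: the source does not say how the size of the sound source signal is measured, so the root-mean-square (RMS) of one frame of samples is assumed here, and the names `rms` and `active_microphones` are hypothetical.

```python
import math


def rms(frame):
    """Root-mean-square level of one frame of samples (assumed size measure)."""
    return math.sqrt(sum(x * x for x in frame) / len(frame))


def active_microphones(sensing_frame, reference_level, mic_count=4):
    """Return the indices of the microphones to operate next.

    Below the reference level only the sensing microphone (index 0) stays on;
    at or above it, all microphones M1 to M4 are operated.
    """
    if rms(sensing_frame) >= reference_level:
        return list(range(mic_count))
    return [0]


quiet = active_microphones([0.0, 0.01, -0.01, 0.0], reference_level=0.1)
loud = active_microphones([0.5, -0.5, 0.5, -0.5], reference_level=0.1)
print(quiet, loud)
```

Keeping a single microphone active while idle is consistent with the stated aim of reducing the terminal's power consumption.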

Here, the sound source signal may have a frequency in the audible range of 20 Hz to 20,000 Hz.

The beamforming unit 120 forms a beam in order to identify the direction of the sound source that generates the sound source signal.

In particular, the beamforming unit 120 sequentially forms a beam for emphasizing the sound source signal in each of a plurality of directions.

That is, the beamforming unit 120 emphasizes a sound source signal arriving from a specific direction: it collects sound source signals generated by a sound source within a specific region in that direction and suppresses sound source signals generated outside that region, thereby giving the sound source signal directivity.

Here, the plurality of directions may include first to fourth directions D1 to D4, and the beams may be formed at regular intervals.

The first to fourth directions D1 to D4 are only examples. The space may be divided into more directions with a beam formed for each, in which case the resolution improves and the direction of the sound source can be identified more accurately.

The beamforming unit 120 may change the direction in which the beam is formed by changing the weights of the microphones M1 to M4 through algorithmic signal processing.
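
The source does not name the algorithm used to set the microphone weights. As one common possibility, the sketch below uses narrowband delay-and-sum beamforming for a uniform line array. The microphone spacing (4 cm), the test frequency (2 kHz), the speed of sound, and all function names are assumptions made for illustration only.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value


def steering_vector(angle_deg, mic_count, spacing_m, freq_hz):
    """Relative phase of a plane wave from angle_deg at each microphone.

    The angle is measured from the array axis, so 90 degrees is straight ahead.
    """
    k = 2 * math.pi * freq_hz / SPEED_OF_SOUND
    cos_t = math.cos(math.radians(angle_deg))
    return [cmath.exp(-1j * k * m * spacing_m * cos_t) for m in range(mic_count)]


def beam_weights(angle_deg, mic_count, spacing_m, freq_hz):
    """Delay-and-sum weights that steer the beam toward angle_deg."""
    a = steering_vector(angle_deg, mic_count, spacing_m, freq_hz)
    return [v / mic_count for v in a]


def beam_output(weights, snapshot):
    """Weighted sum of one complex sample per microphone."""
    return sum(w.conjugate() * x for w, x in zip(weights, snapshot))


# A unit-amplitude 2 kHz source at 60 degrees, seen by four microphones 4 cm apart.
snapshot = steering_vector(60, 4, 0.04, 2000)
toward = abs(beam_output(beam_weights(60, 4, 0.04, 2000), snapshot))
away = abs(beam_output(beam_weights(120, 4, 0.04, 2000), snapshot))
print(round(toward, 3), round(away, 3))
```

Changing only the weights moves the beam: the same four samples give a large output when the weights point at the source and a small one when they point elsewhere.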

As shown in FIG. 3, the beam pattern expresses, as a gain, the degree to which a sound source signal is picked up depending on the direction from which it arrives at the sound source receiving unit 110.

For example, when a beam is formed in the 90° direction, facing straight ahead from the sound source receiving unit 110, the gain is made highest in the 90° direction and markedly lower in the other directions.

Here, the beam formed in the direction where the gain for the sound source signal is largest is called the mainlobe, and the beams formed with low gain in other directions are called sidelobes.

The beamforming unit 120 makes the mainlobe, in the direction where the sound source signal is to be emphasized, have the maximum gain, and makes the sidelobes in the remaining directions have the minimum gain.
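
To make the notion of a beam pattern concrete, the sketch below computes the gain, in dB, of a uniform four-microphone line array steered to 90° for sound arriving from different directions. It assumes delay-and-sum weighting, a 4 cm spacing, and a 2 kHz tone; none of these values come from the source, and the function name `gain_db` is hypothetical.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed


def gain_db(look_deg, steer_deg=90.0, mic_count=4, spacing_m=0.04, freq_hz=2000.0):
    """Gain in dB of a line array steered to steer_deg, for sound from look_deg."""
    k = 2 * math.pi * freq_hz / SPEED_OF_SOUND
    shift = math.cos(math.radians(look_deg)) - math.cos(math.radians(steer_deg))
    response = sum(cmath.exp(-1j * k * m * spacing_m * shift) for m in range(mic_count))
    magnitude = max(abs(response) / mic_count, 1e-12)  # avoid log10(0) at a null
    return 20 * math.log10(magnitude)


# Sample the pattern every 10 degrees and locate the direction of highest gain.
pattern = {angle: gain_db(angle) for angle in range(0, 181, 10)}
mainlobe = max(pattern, key=pattern.get)
print(mainlobe, round(pattern[0], 1))
```

In this sketch the gain peaks at 0 dB in the steered direction and falls off on either side, which is the mainlobe behavior described above.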

The first control unit 130 compares the magnitudes of the plurality of sound source signals emphasized by the beamforming unit 120, identifies the direction, among the plurality of directions, from which the largest sound source signal is output as the sound source direction in which the sound source is located, and fixes the beam in that sound source direction.

For example, as shown in FIG. 4, the beamforming unit 120 sequentially forms a beam in each of the first to fourth directions D1 to D4, and the first control unit 130 compares the magnitudes of the sound source signals received from each direction. Here, because the sound source is located in the third direction D3 and the sound source signal from the third direction D3 is the largest, the first control unit 130 fixes the beam in the third direction D3.
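
The scan-and-fix step in this example can be sketched as follows. The function `scan_and_fix` and the `measure` callable are hypothetical; `measure` stands in for forming a beam in one direction and returning the magnitude of the emphasized signal, and the numeric levels are invented for illustration.

```python
def scan_and_fix(directions, measure):
    """Form a beam in each direction in turn and fix it on the loudest one.

    measure(direction) is assumed to return the magnitude of the sound
    source signal emphasized by a beam formed in that direction.
    """
    levels = {d: measure(d) for d in directions}
    fixed = max(levels, key=levels.get)
    return fixed, levels


# Hypothetical magnitudes with the sound source located in direction D3.
measured = {"D1": 0.08, "D2": 0.15, "D3": 0.62, "D4": 0.21}
fixed, levels = scan_and_fix(["D1", "D2", "D3", "D4"], measured.get)
print(fixed)
```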

또한, 제1 제어부(130)는 음원 방향의 변화 또는 음원 방향에서의 음원 신호의 크기 변화량을 통해 사용자가 제2 공간(A2)으로 들어오고 있는지 또는 사용자가 제2 공간(A2)에서 나가고 있는지 또는 제2 공간(A2)에 상주하는지 판단할 수도 있다.The first controller 130 determines whether the user is entering the second space A2 or the user is leaving the second space A2 through the change in the direction of the sound source direction or the magnitude of the amplitude of the sound source signal in the sound source direction It may be determined whether it resides in the second space A2.

For example, assuming that a door is located in the first direction D1, as the beamforming unit 120 sequentially forms beams in the first to fourth directions D1 to D4, if the sound source direction changes from the first direction D1 to any one of the second to fourth directions D2 to D4, it can be determined that the user is entering the second space A2.

If the sound source direction changes from any one of the second to fourth directions D2 to D4 to the first direction D1, it can be determined that the user is leaving the second space A2.

If the sound source direction does not change, or changes only among the second to fourth directions D2 to D4, it can be determined that the user is staying in the second space A2.
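The three direction-based rules above can be written as a small decision function. This is only a sketch under the example's assumption that the door lies in direction D1; the function name and the string labels are hypothetical.

```python
DOOR = "D1"  # assumption taken from the example: the door is in direction D1

def classify_by_direction(previous, current, door=DOOR):
    """Classify user movement from two successive sound source directions."""
    if previous == door and current != door:
        return "entering"   # moved from the door direction into the room
    if previous != door and current == door:
        return "leaving"    # moved from inside the room toward the door
    return "staying"        # no change, or change only among non-door directions

print(classify_by_direction("D1", "D3"))  # entering
print(classify_by_direction("D4", "D1"))  # leaving
print(classify_by_direction("D2", "D4"))  # staying
```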

Also, assuming that a door is located in front of the sound source receiving unit 110, if the amount of change in the magnitude of the sound source signal in the sound source direction gradually increases, it can be determined that the user is entering the second space A2, and if it gradually decreases, it can be determined that the user is leaving the second space A2.

If the amount of change in the magnitude of the sound source signal in the sound source direction is constant, it can be determined that the user is staying in the second space A2.
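A magnitude-based version of the same decision might look like the sketch below. It rests on one reading of the text, namely that the magnitude measured in the sound source direction rises steadily as the user approaches and falls steadily as the user walks away; the `tolerance` threshold, the function name, and the sample values are assumptions, not values from the specification.

```python
def classify_by_magnitude(magnitudes, tolerance=0.05):
    """Classify user movement from successive magnitudes in the source direction.

    `tolerance` (hypothetical) is the smallest step treated as a real change.
    """
    deltas = [b - a for a, b in zip(magnitudes, magnitudes[1:])]
    if deltas and all(d > tolerance for d in deltas):
        return "entering"   # signal keeps growing
    if deltas and all(d < -tolerance for d in deltas):
        return "leaving"    # signal keeps shrinking
    return "staying"        # roughly constant

print(classify_by_magnitude([0.1, 0.3, 0.6]))     # entering
print(classify_by_magnitude([0.6, 0.3, 0.1]))     # leaving
print(classify_by_magnitude([0.50, 0.52, 0.49]))  # staying
```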

In this way, in the IoT network system according to an embodiment of the present invention, the voice recognition control server 200 can control the IoT device 300 according to preset per-situation scenarios, based on a determination of the user's situation.

For example, the voice recognition control server 200 may turn on a light when it is determined that the user is entering the second space A2, and turn off the light when it is determined that the user is leaving the second space A2.

The first control unit 130 determines whether the sound source signal output from the sound source direction is a voice command signal of the user. Here, a voice command signal means a signal defined in advance for controlling the IoT device 300.

The signal conversion unit 140 converts the voice command signal into an electrical signal when the sound source signal output from the sound source direction is determined to be a voice command signal.

However, if the sound source signal is not determined to be a voice command signal, the sound source receiving unit 110 continues to perform the sound source signal detection operation.

The output unit 150 outputs the electrical signal to the voice recognition control server 200.

The voice receiving terminal 100 may have no separate component for interpreting the voice command signal and may serve only to convert the voice command signal into an electrical signal and deliver it to the voice recognition control server 200.

Here, the voice command signal is interpreted by the voice recognition control server 200, which is described later.

As described above, because the voice receiving terminal 100 according to an embodiment of the present invention has no separate component for interpreting voice command signals, power consumption and production cost can be reduced, which allows the voice receiving terminal 100 to be supplied to consumers (users) at a low price.

FIG. 5 is an exemplary perspective view of a voice receiving terminal according to an embodiment of the present invention.

As shown in FIG. 5, the voice receiving terminal 100 according to an embodiment of the present invention may further include a power connection unit 160.

The power connection unit 160 may consist only of a plug 161 and be formed integrally with the voice receiving terminal 100, as in FIG. 5(a), or may consist of a plug 161 and a USB (Universal Serial Bus) socket 162 coupled to the plug 161 and be coupled to a USB terminal 101 of the voice receiving terminal 100, as in FIG. 5(b). Here, as shown in FIG. 5(c), the power connection unit 160, that is, the plug 161, is coupled to an outlet 10 that supplies commercial power, thereby supplying power to the voice receiving terminal 100.

Meanwhile, each room is generally provided with an outlet 10. Because power is supplied to the voice receiving terminal 100 simply by inserting the plug 161 of the power connection unit 160 into the outlet provided in a room, a voice receiving terminal 100 can be installed easily in every room.

In addition, when the voice receiving terminal 100 and the power connection unit 160 are formed separately and coupled together, a defective or failed power connection unit 160 can be replaced on its own.

FIG. 6 is a detailed block diagram of a voice recognition control server according to an embodiment of the present invention.

As shown in FIG. 6, the voice recognition control server 200 according to an embodiment of the present invention may include a voice recognition unit 210 and a second control unit 220.

The voice recognition unit 210 receives, from the voice receiving terminal 100, an electrical signal corresponding to the user's voice command signal and generates voice control information corresponding to that electrical signal.

Here, the voice control information may be in text form and may be information that selects the IoT device 300 to be controlled and controls the state of that IoT device 300.

The second control unit 220 controls the IoT device 300 according to the voice control information. For example, the second control unit 220 lowers or raises the set temperature of an air conditioner according to voice control information for lowering or raising that set temperature.

Meanwhile, as described above, voice receiving terminals 100 may be disposed in a plurality of second spaces A2, respectively. In this case, the voice recognition unit 210 can determine the user's position based on the levels of the electrical signals provided from each of the plurality of voice receiving terminals 100. Here, the voice recognition unit 210 may convert the strength of the voice command signal into an average value or a representative value of the electrical signal, and may convert this value into the level of the electrical signal.

That is, the voice recognition control server 200 can determine that the user is located in the second space A2 in which the voice receiving terminal 100 that transmitted the highest-level electrical signal among the plurality of electrical signals is disposed.

In addition, the voice recognition control server 200 may itself directly receive the user's voice command signal and convert it into an electrical signal. In this case, the voice recognition unit 210 can determine the user's position based on the levels of the electrical signals provided from each of the plurality of voice receiving terminals 100 and the level of the electrical signal converted by the voice recognition control server 200.

Here, the voice recognition unit 210 generates voice control information corresponding to the electrical signal with the highest level among the plurality of electrical signals and, because the remaining electrical signals are likely to be inaccurate, may not generate voice control information for them.

In this way, the voice recognition unit 210 can generate accurate voice control information with which to control the IoT device 300.

Meanwhile, when the levels of two electrical signals are similar to each other, within an error range of, for example, 10 to 20%, the voice recognition unit 210 can determine that the user is located between the second spaces A2 in which the two voice receiving terminals 100 that transmitted those two electrical signals are respectively disposed, and in this case may generate voice control information corresponding to each of the electrical signals.
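The level comparison across terminals, including the "similar levels" case, can be sketched as below. The text does not define how the error range is computed, so this sketch assumes a relative difference measured against the larger level and picks 15% from within the stated 10 to 20% range; `locate_user`, the room names, and the level values are hypothetical.

```python
def locate_user(levels, tolerance=0.15):
    """Return the terminal(s) whose space the user is taken to be in or between.

    `levels` maps a terminal identifier to the level of its electrical signal.
    `tolerance` is an assumed relative error range (10-20% in the text).
    """
    ranked = sorted(levels, key=levels.get, reverse=True)
    best = ranked[0]
    if len(ranked) > 1:
        second = ranked[1]
        gap = levels[best] - levels[second]
        if levels[best] > 0 and gap / levels[best] <= tolerance:
            return [best, second]   # user is between the two spaces
    return [best]                   # user is in the space with the highest level

print(locate_user({"bedroom": 0.9, "kitchen": 0.3, "study": 0.2}))  # ['bedroom']
print(locate_user({"bedroom": 0.80, "kitchen": 0.72}))  # ['bedroom', 'kitchen']
```

When two terminals are returned, voice control information would be generated for both signals; otherwise only for the single highest-level signal.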

Meanwhile, when a voice command signal is detected but no movement of the user is detected, the IoT network system according to an embodiment of the present invention may refrain from generating voice control information immediately and instead generate it only after confirmation by the user through, for example, the user's mobile terminal. For example, a confirmation message asking whether the user is actually issuing a voice command to control the IoT device 300 may be sent to the mobile terminal, and voice control information may be generated only if the user responds to it.

This makes it possible to block a third party from controlling the IoT device 300 without authorization by sending a voice command signal from a remote location, for example through hacking.
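The confirmation gate described above can be sketched as a guard in front of control-information generation. The callable `confirm` stands in for whatever mechanism sends the confirmation message to the mobile terminal and waits for a reply; it, the function name, and the returned dictionary shape are all hypothetical.

```python
def generate_voice_control_info(command_text, motion_detected, confirm):
    """Return control information, or None if the command is not confirmed.

    `confirm` is a hypothetical callable: it asks the user's mobile terminal
    whether the command is genuine and returns True only on a positive reply.
    """
    if not motion_detected and not confirm(command_text):
        return None                      # unconfirmed command is dropped
    return {"command": command_text}     # assumed shape of the control info

# Motion was detected, so no confirmation is requested.
print(generate_voice_control_info("light on", True, lambda text: False))
# No motion and the user does not confirm: nothing is generated.
print(generate_voice_control_info("light on", False, lambda text: False))
```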

FIG. 7 is a flowchart of an IoT network method according to an embodiment of the present invention.

Hereinafter, an IoT network method according to an embodiment of the present invention is described with reference to FIGS. 1 to 7; description of content that is the same as for the IoT network system according to the embodiment described above is omitted.

As shown in FIG. 7, the IoT network method according to an embodiment of the present invention may include steps S10 to S80.

First, a sound source signal detection operation is performed through any one of the plurality of microphones M1 to M4 (S10). If a sound source signal is detected, all of the microphones M1 to M4 are operated to receive the sound source signal (S20); if no sound source signal is detected, the detection operation continues through the single microphone (S10).

Next, beams for emphasizing the sound source signals received through the plurality of microphones M1 to M4 are formed sequentially in each of a plurality of directions (S30).

Next, the magnitudes of the plurality of sound source signals are compared with one another, and the beam is fixed in the sound source direction, among the plurality of directions, from which the largest sound source signal is output (S40).

Next, it is determined whether the sound source signal output from the sound source direction is a voice command signal of the user (S50).

Next, if the sound source signal output from the sound source direction is determined to be a voice command signal, the voice command signal is converted into an electrical signal (S60); if the sound source signal is not determined to be a voice command signal, the method returns to the step of performing the sound source signal detection operation (S10).

Next, the electrical signal is received and voice control information corresponding to the electrical signal is generated (S70), and the IoT device is controlled according to the voice control information (S80).
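The control flow of steps S10 to S80 can be traced with the sketch below. The two stub classes are hypothetical stand-ins for the voice receiving terminal 100 and the voice recognition control server 200; they replace real audio capture, beamforming, and recognition with fixed values so that only the branching between steps is shown.

```python
class StubTerminal:
    """Hypothetical stand-in for the voice receiving terminal 100."""
    def __init__(self, levels, command):
        self.levels = levels      # direction -> magnitude after beamforming
        self.command = command    # None means "not a voice command"

    def detect(self):             # S10: single-microphone detection
        return bool(self.levels)

    def scan(self):               # S20-S30: all microphones, one beam per direction
        return self.levels

    def is_voice_command(self, direction):   # S50
        return self.command is not None

    def convert(self, direction):            # S60: to an "electrical signal"
        return {"direction": direction, "payload": self.command}


class StubServer:
    """Hypothetical stand-in for the voice recognition control server 200."""
    def __init__(self):
        self.applied = []

    def recognize(self, signal):  # S70: electrical signal -> control information
        return signal["payload"]

    def control(self, info):      # S80: drive the IoT device
        self.applied.append(info)


def run_cycle(terminal, server):
    """Run one pass and return the label of the step the flow ends on."""
    if not terminal.detect():
        return "S10"                              # keep listening
    levels = terminal.scan()
    direction = max(levels, key=levels.get)       # S40: fix beam on the largest
    if not terminal.is_voice_command(direction):
        return "S10"                              # not a command: back to detection
    server.control(server.recognize(terminal.convert(direction)))
    return "S80"


server = StubServer()
print(run_cycle(StubTerminal({"D1": 0.2, "D3": 0.9}, "light on"), server))  # S80
print(server.applied)                                                      # ['light on']
print(run_cycle(StubTerminal({"D1": 0.2}, None), server))                  # S10
```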

In this way, the IoT network method according to an embodiment of the present invention uses the voice receiving terminal 100 disposed in the second space A2 to deliver the user's voice command to the voice recognition control server 200 disposed in the first space A1, thereby improving the voice recognition rate of the voice recognition control server 200.

The embodiments of the present invention disclosed in this specification and the drawings merely present specific examples to explain the technical content of the invention easily and to aid understanding of it, and are not intended to limit the scope of the invention. Accordingly, the scope of the present invention should be construed as including, in addition to the embodiments disclosed herein, all changes or modifications derived from the technical idea of the present invention.

100: voice receiving terminal
110: sound source receiving unit
120: beamforming unit
130: first control unit
140: signal conversion unit
150: output unit

Claims

1. A voice receiving terminal comprising:
a sound source receiving unit including a plurality of microphones for receiving a sound source signal;
a beamforming unit for sequentially forming beams for emphasizing the sound source signal in a plurality of directions;
a control unit for comparing the magnitudes of the plurality of sound source signals emphasized by the beamforming unit, fixing the beam in the sound source direction, among the plurality of directions, from which the largest sound source signal is output, and determining whether the sound source signal output from the sound source direction is a voice command signal of a user; and
a signal conversion unit for converting the voice command signal into an electrical signal when the sound source signal output from the sound source direction is the voice command signal.

2. The voice receiving terminal of claim 1,
wherein the sound source receiving unit performs a sound source signal detection operation through any one of the plurality of microphones and, when the sound source signal is detected, operates the plurality of microphones to receive the sound source signal.

3. The voice receiving terminal of claim 1,
wherein the plurality of directions include first to fourth directions, and the beams are formed at regular intervals.

The method according to claim 1,
The sound source receiving unit
If the sound source signal is equal to or greater than a reference value,
Voice receiving terminal.

5. The voice receiving terminal of claim 1,
further comprising an output unit for outputting the electrical signal.

6. The voice receiving terminal of claim 1,
further comprising a power connection unit consisting of a plug coupled to an outlet that supplies commercial power, or of the plug and a USB socket coupled to the plug.

7. An Internet of Things network system comprising:
a voice receiving terminal including a sound source receiving unit including a plurality of microphones for receiving a sound source signal; a beamforming unit for sequentially forming beams for emphasizing the sound source signal in a plurality of directions; a first control unit for comparing the magnitudes of the plurality of sound source signals emphasized by the beamforming unit, fixing the beam in the sound source direction, among the plurality of directions, from which the largest sound source signal is output, and determining whether the sound source signal output from the sound source direction is a voice command signal of a user; a signal conversion unit for converting the voice command signal into an electrical signal when the sound source signal output from the sound source direction is the voice command signal; and an output unit for outputting the electrical signal; and
a voice recognition control server including a voice recognition unit for receiving the electrical signal and generating voice control information corresponding to the electrical signal; and a second control unit for controlling an Internet of Things device according to the voice control information.

8. The Internet of Things network system of claim 7,
wherein the voice recognition control server is disposed in a first space and the voice receiving terminal is disposed in a second space different from the first space.

9. The Internet of Things network system of claim 8,
wherein the voice recognition control server controls the Internet of Things device located in the second space.

10. The Internet of Things network system of claim 8,
wherein the sound source receiving unit determines whether the user is present in the second space based on the sound source signal.

11. The Internet of Things network system of claim 8,
wherein the first control unit determines, from a change in the sound source direction or from the amount of change in the magnitude of the sound source signal in the sound source direction, whether the user is entering the second space or leaving the second space.

12. The Internet of Things network system of claim 11,
wherein the second control unit controls the Internet of Things device according to whether the user is entering the second space or leaving the second space.

13. The Internet of Things network system of claim 7,
wherein the voice receiving terminal is connected to the voice recognition control server by being registered with the voice recognition control server.

14. The Internet of Things network system of claim 8,
wherein the voice receiving terminal is disposed in each of a plurality of the second spaces, and
the voice recognition unit determines the position of the user based on the levels of the electrical signals provided from each of the plurality of voice receiving terminals.

15. The Internet of Things network system of claim 14,
wherein the voice recognition control server receives the voice command signal and converts it into an electrical signal, and
the voice recognition unit determines the position of the user based on the levels of the electrical signals provided from each of the plurality of voice receiving terminals and the level of the electrical signal converted by the voice recognition control server.

16. The Internet of Things network system of claim 15,
wherein the voice recognition unit generates voice control information corresponding to the electrical signal having the highest level among the plurality of electrical signals.

17. An Internet of Things network method comprising:
performing a sound source signal detection operation through any one of a plurality of microphones;
operating the plurality of microphones when the sound source signal is detected;
receiving the sound source signal through the plurality of microphones;
sequentially forming beams for emphasizing the sound source signal in a plurality of directions;
comparing the magnitudes of the plurality of sound source signals and fixing the beam in the sound source direction, among the plurality of directions, from which the largest sound source signal is output;
determining whether the sound source signal output from the sound source direction is a voice command signal of a user; and
converting the voice command signal into an electrical signal if the sound source signal output from the sound source direction is the voice command signal.

18. The Internet of Things network method of claim 17,
further comprising returning to the step of performing the sound source signal detection operation if the sound source signal is not determined to be the voice command signal.

19. The Internet of Things network method of claim 17, further comprising:
receiving the electrical signal and generating voice control information corresponding to the electrical signal; and
controlling an Internet of Things device according to the voice control information.