KR20070081989A

KR20070081989A - System and method for controlling voice detection of network terminal

Info

Publication number: KR20070081989A
Application number: KR1020060014386A
Authority: KR
Inventors: 박영희; 김현수; 심현식
Original assignee: 삼성전자주식회사
Priority date: 2006-02-14
Filing date: 2006-02-14
Publication date: 2007-08-20
Also published as: US7890334B2; KR100762636B1; US20070201639A1

Abstract

A system and a method for controlling voice detection of a network terminal are provided. They detect voice at the voice detection time point that is optimal for each service, and they let the server lead voice detection so that voice detection signals issued by the terminal and by the server do not interfere with each other. A microphone (102) is opened and closed by a trigger signal and receives the user's voice while open. A voice detector (104) receives the user's voice from the microphone, detects a voice signal, and transmits the voice signal to the server (200). A setting value setting unit (106) stores a default voice detection setting value and, when a service-specific voice detection setting value is received from the server, sets the voice detection setting value to that service-specific value. A master trigger unit (108) permits generation of the trigger signal according to the service-specific voice detection setting value when a master trigger enable signal is received from the server, and stops generating the trigger signal when a master trigger disable signal is received.

Description

System and method for controlling voice detection of network terminal {SYSTEM AND METHOD FOR CONTROLLING VOICE DETECTION OF NETWORK TERMINAL}

FIG. 1 is a block diagram of a voice detection control system of a network robot according to an embodiment of the present invention.

FIG. 2 is a flowchart of a voice detection control process of a network robot according to a first embodiment of the present invention.

FIG. 3 is a flowchart of the robot's voice transmission process within the voice detection control process of the network robot according to the first embodiment of the present invention.

FIG. 4 is a flowchart of a voice detection control process of a network robot according to a second embodiment of the present invention.

FIG. 5 is a flowchart of a trigger generation process according to an embodiment of the present invention.

FIG. 6 is a diagram showing single and multi-triggers according to an embodiment of the present invention.

FIG. 7 is a diagram showing a case in which triggers overlap according to an embodiment of the present invention.

The present invention relates to a network terminal system and, more particularly, to a system and method for controlling voice detection of a network terminal.

A network terminal system is a system in which a server can control a terminal through a network. In such a system, the terminal operates according to control signals that the server sends over the network.

Such a network terminal system can offer services based on voice recognition. Depending on the characteristics of the service or the operating policy, conventional network terminals have either kept the microphone open and detected voice continuously, or opened the microphone only when needed.

However, the conventional method of receiving voice input continuously has two problems. Because the voice detection time point is not precise, recognition performance may degrade. And because every sound, not only the required voice, is delivered to the network, speech that the user did not intend to send may be transmitted, which can invade the user's privacy.

The method of opening the microphone only when needed guarantees recognition performance, but it can be used only in services that know exactly how many voice inputs to expect. In other systems, the user has the inconvenience of having to send an input signal again and again.

In addition, a network terminal can provide a variety of voice-based services. If it uses only one method, whether keeping the microphone open continuously or opening it whenever needed, it cannot offer a different voice input method for each service.

To solve the problems described above, the present invention provides a system and method for controlling the start of voice detection of a network terminal, which control the terminal to perform voice detection at a voice detection time point optimized for each service.

The present invention also provides a system and method for controlling the start of voice detection of a network terminal in which the server can lead control of the voice detection time point, in order to prevent the conflicts that arise when voice detection signals from the terminal and from the server are mixed.

To achieve these objects, a voice detection control system of a network terminal according to the present invention includes: a terminal that, when voice signal detection is required, receives and sets a service-specific voice detection setting value and generates a trigger signal for voice detection according to that value to detect voice; and a server that determines the service of the terminal and transmits the voice detection setting value for that service to the terminal.

A voice detection control method of a network terminal according to the present invention includes: receiving, by the terminal, a service-specific voice detection setting value from a server; and, when voice signal detection is required, generating a trigger signal for voice detection according to the service-specific voice detection setting value to detect voice.

Hereinafter, exemplary embodiments of the present invention are described in detail with reference to the accompanying drawings. Note that the same components are denoted by the same reference numerals and symbols wherever possible, even when they appear in different drawings. In describing the present invention, detailed descriptions of related known functions or configurations are omitted where they would unnecessarily obscure the subject matter of the invention.

In the following description of the voice detection control system of a network terminal according to an embodiment of the present invention, a network terminal is a device whose operation is controlled by a server through a network; examples include a mobile phone and a robot. The description below uses a robot as the example network terminal. However, the network terminal may be any device whose operation can be controlled by a server through a network.

FIG. 1 is a block diagram of a voice detection control system of a network terminal according to an embodiment of the present invention. Referring to FIG. 1, in the voice detection control system of a network robot, the server 200 transmits a master trigger signal to the robot 100 when voice detection by the robot 100 is needed. The master trigger signal may be a master trigger enable signal or a master trigger disable signal. The master trigger enable signal puts the robot 100 in a state in which voice detection can be started under external or internal control. The master trigger disable signal means that the robot 100 cannot perform voice detection until it receives a master trigger enable signal from the server 200. When the master trigger enable signal is received, the robot 100 enters a state in which it can start voice detection, and when a voice detection start signal occurs, voice detection may be triggered according to the default voice detection setting value stored in the robot 100. The server 200 also stores a voice detection setting value for each service. When the robot 100 is to detect voice, the server 200 determines the service of the robot 100 and transmits the voice detection setting value for that service to the robot 100, so that the robot 100 sets a voice detection setting value suited to the service. The service-specific voice detection setting value may include the number of voice detections, the voice detection interval length, and the non-detection interval length for each service.
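The three components of the service-specific setting value can be pictured as a small record. The following is a minimal sketch only: the class name, field names, units (seconds), validation rules, and default numbers are all assumptions made for illustration, since the source names the three quantities but gives no representation or concrete values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VoiceDetectionSettings:
    """Voice detection setting value (illustrative field names, not from the source)."""
    detection_count: int     # number of voice detections
    detect_seconds: float    # voice detection interval length (microphone open)
    blocked_seconds: float   # non-detection interval length (microphone closed)

    def __post_init__(self):
        # Assumed sanity checks; the source does not state any limits.
        if self.detection_count < 1:
            raise ValueError("detection_count must be at least 1")
        if self.detect_seconds <= 0 or self.blocked_seconds < 0:
            raise ValueError("detect_seconds must be positive, blocked_seconds non-negative")


# Hypothetical default value stored in the terminal.
DEFAULT_SETTINGS = VoiceDetectionSettings(detection_count=1,
                                          detect_seconds=5.0,
                                          blocked_seconds=1.0)
```

A frozen dataclass is used here so that a setting value received from the server cannot be modified by accident once it has been set.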

Meanwhile, the robot 100 starts voice detection in response to the user's operation of a switch, a signal from the user such as a voice call or a hand wave, or a voice detection start request from the server 200. In each case, the master trigger unit 108 of the robot 100 must be in the enabled state. For example, when the user operates a switch mounted on the outside of the robot 100 (a touch pad, a remote control, or the like), the robot 100 may treat this as a voice detection start input and start voice detection. It may do the same when the user sends a signal such as calling the robot's name. The robot 100 may also start voice detection when a voice detection start signal is received from the server 200. The robot 100 has a default voice detection setting value and, when voice detection is started by the user, generates a voice detection trigger according to the default value to perform voice detection. When voice detection is started by the server 200, the robot 100 receives the service-specific voice detection setting value from the server 200 and generates a trigger according to that value to perform voice detection. That is, the robot 100 generates a trigger according to the service's number of voice detections, voice detection interval length, and non-detection interval length.

The robot 100 may generate a single trigger or a multi-trigger depending on the service. A single trigger performs one voice detection; a multi-trigger performs voice detection several times at regular intervals. When a single trigger is generated, the robot 100 opens the microphone 102 only once, either for a fixed time or until the point at which voice detection ends (EPD: End-Point Detection), so that one voice detection takes place. When a multi-trigger is generated, the robot 100 repeatedly opens the microphone 102 for a fixed period and then closes it, so that voice detection takes place several times.

The configurations of the server 200 and the robot 100 are described in detail below. The robot 100 may include a microphone 102, a voice detector 104, a setting value setting unit 106, and a master trigger unit 108.

While the master trigger unit 108 is enabled, the microphone 102 is opened or closed according to the generated voice detection trigger signal. When open, it receives the user's voice and outputs it to the voice detector 104. The voice detector 104 detects the voice signal input from the microphone 102 and transmits the detected voice signal to the server 200.

The setting value setting unit 106 stores the default voice detection setting value and, when a service-specific voice detection setting value is received from the server 200, sets the voice detection setting value to that service-specific value.

The master trigger unit 108 sets its state according to the master trigger signal (enable signal or disable signal) from the server 200. When a master trigger enable signal is received from the server 200, the master trigger unit 108 turns its state ON, making it possible to start voice detection. When a master trigger disable signal is received, it turns its state OFF, so that the microphone 102 cannot be opened under any circumstances. When generating a trigger, the master trigger unit 108 uses the default voice detection setting value if that is the value set in the setting value setting unit 106, and uses the service-specific voice detection setting value if that is the value set.
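The interplay between the master trigger state and the stored setting value can be sketched as a small state holder. This is an illustration under stated assumptions, not the patented implementation: the class and method names, the string signal values "ENABLE" and "DISABLE", the initial OFF state, and the merging of units 106 and 108 into one object are all choices made for the example.

```python
class MasterTriggerUnit:
    """Illustrative gate switched by the server; combines units 106 and 108 for brevity."""

    def __init__(self, default_settings):
        self.enabled = False                   # assumption: starts in the OFF state
        self.default_settings = default_settings
        self.service_settings = None           # set once the server sends a value

    def on_master_signal(self, signal):
        # Hypothetical wire values for the enable and disable signals.
        if signal == "ENABLE":
            self.enabled = True
        elif signal == "DISABLE":
            self.enabled = False
        else:
            raise ValueError(f"unknown master trigger signal: {signal!r}")

    def on_service_settings(self, settings):
        self.service_settings = settings

    def trigger(self):
        """Return the setting value to trigger with, or None if the mic must stay closed."""
        if not self.enabled:
            return None
        if self.service_settings is not None:
            return self.service_settings
        return self.default_settings
```

In this sketch a disabled unit never yields a setting value, which mirrors the statement that the microphone cannot be opened under any circumstances while the master trigger is OFF.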

The server 200 includes a voice recognizer 202, a service-specific setting value storage unit 204, and a controller 206.

The voice recognizer 202 receives the voice signal detected by the robot 100 and recognizes the received voice.

The service-specific setting value storage unit 204 stores a predetermined voice detection setting value for each service. The service-specific voice detection setting value may include the number of voice detections, the voice detection interval length, and the non-detection interval length for each service.

The controller 206 decides whether the robot 100 may perform voice detection and transmits the master trigger signal to the robot 100. The controller 206 also determines the service performed by the robot 100 and, from among the setting values stored in the service-specific setting value storage unit 204, transmits to the robot 100 the voice detection setting value for that service.
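The controller's two duties, deciding the master trigger state and looking up the per-service setting value, might be sketched as follows. The service names, the numbers in the table, the message tuples, the message order, and the behaviour for an unknown service are all hypothetical; the source states only that the controller sends a master trigger signal and the setting value for the determined service.

```python
# Hypothetical contents of storage unit 204:
# service -> (detection count, detection seconds, non-detection seconds)
SERVICE_SETTINGS = {
    "weather_query": (1, 5.0, 1.0),
    "quiz_game": (5, 3.0, 2.0),
}


def messages_for_robot(service, voice_detection_allowed):
    """Build the messages an illustrative controller would send, in order."""
    if not voice_detection_allowed:
        return [("MASTER_TRIGGER", "DISABLE")]
    messages = [("MASTER_TRIGGER", "ENABLE")]
    # Assumption: an unknown service sends no setting value,
    # so the robot keeps its default value.
    if service in SERVICE_SETTINGS:
        messages.append(("SETTINGS", SERVICE_SETTINGS[service]))
    return messages
```

For example, `messages_for_robot("quiz_game", True)` yields an enable message followed by the quiz setting value, while a disallowed robot receives only the disable message.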

The process by which the robot 100 and the server 200 configured as above control the start of voice detection of the network robot is described in detail below. According to an embodiment of the present invention, in the master trigger enabled state the voice detection start control process may be started by the robot 100 or by the server 200. For convenience of description, FIGS. 2 and 4 described below assume the master trigger enabled state.

First, the case in which the voice detection start control process is started by the robot 100, according to the first embodiment of the present invention, is described. FIG. 2 is a flowchart of the voice detection start control process of the network robot according to the first embodiment of the present invention.

Referring to FIG. 2, in step 212 the robot 100 transmits the voice detected after voice detection starts. The detailed steps of step 212 are shown in FIG. 3. FIG. 3 is a flowchart of the robot's voice transmission process within the voice detection start control process of the network robot according to the first embodiment of the present invention.

To describe step 212 of FIG. 2 with reference to FIG. 3, the robot 100 determines in step 302 whether there is a voice detection start request. For example, the robot 100 may determine that there is a voice detection start request when the user operates a switch mounted on the outside of the robot 100 (a touch pad, a remote control, or the like) or calls the robot's name.

If there is a voice detection start request, the robot 100 opens the microphone 102 in step 304 and determines whether voice is input. If voice is input, the robot 100 detects the input voice (for example, a command requesting a specific service) in step 310. In step 312, the robot 100 transmits the detected voice to the server 200.

If no voice is input, the robot 100 generates a voice detection trigger in step 306 according to the pre-stored default voice detection setting value. In step 308, the robot 100 determines whether trigger generation has ended. Trigger generation may end according to the setting value.

If trigger generation has not ended, the robot 100 repeats steps 304 to 312, keeping the microphone 102 open for a certain period and waiting for the user's voice input until trigger generation ends, so that voice detection can take place.
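The loop through steps 304 to 312 can be sketched as below. This is a simplified model under explicit assumptions: the end of trigger generation (step 308) is represented as a fixed number of polls, `poll_microphone` and `send_to_server` are hypothetical callables supplied by the caller, and the sketch stops after the first detected utterance.

```python
def capture_and_send(poll_microphone, send_to_server, max_polls):
    """Illustrative version of steps 304 to 312 of FIG. 3.

    poll_microphone() returns detected audio, or None if no voice was input.
    max_polls stands in for the end of trigger generation (an assumption).
    """
    for _ in range(max_polls):        # step 308: has trigger generation ended?
        audio = poll_microphone()     # step 304: microphone open, any voice input?
        if audio is not None:
            send_to_server(audio)     # steps 310 and 312: detect, then transmit
            return True
    return False                      # trigger ended without any voice input
```

A caller could pass `lambda: next(frames)` over a prepared sequence of frames to exercise the loop without real audio hardware.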

When the voice from the robot 100 (for example, a command requesting a specific service) is delivered to the server 200 through the process of FIG. 3, the server 200 recognizes the voice and, in step 214 of FIG. 2, determines the service provided to or performed by the robot 100.

In step 216, the server 200 transmits to the robot 100, from among the setting values stored in the service-specific setting value storage unit 204, the voice detection setting value for the service provided to or performed by the robot 100.

The robot 100 then receives the service-specific voice detection setting value from the server 200 and sets it in step 218. When the EPD_Start (voice detection start) signal is received from the server 200 in step 220, the robot 100 proceeds to step 222 and generates a trigger according to the service-specific voice detection setting value to perform voice detection. The robot 100 generates the trigger according to the service's number of voice detections, voice detection interval length, and non-detection interval length.

The robot 100 may generate a single trigger or a multi-trigger depending on the service. A single trigger performs one voice detection; a multi-trigger performs voice detection several times at regular intervals. When a single trigger is generated, the robot 100 opens the microphone 102 only once, either for a fixed time or until the end point of voice detection (EPD: End-Point Detection), so that one voice detection takes place. When a multi-trigger is generated, the robot 100 repeatedly opens the microphone 102 for a fixed period and then closes it, so that voice detection takes place several times.

Next, the case in which the voice detection start control process is started by the server 200, according to the second embodiment of the present invention, is described. FIG. 4 is a flowchart of the voice detection start control process of the network robot according to the second embodiment of the present invention.

Referring to FIG. 4, the server 200 recognizes the occurrence of a voice detection start event in step 402. That is, the server 200 continuously checks and keeps track of the service state of the robot 100. When a voice recognition event occurs (402), the server 200 determines in step 404 the service performed by the robot 100.

In step 406, the server 200 transmits to the robot 100, from among the setting values stored in the service-specific setting value storage unit 204, the voice detection setting value for the service provided to or performed by the robot 100.

The robot 100 then receives the service-specific voice detection setting value from the server 200 and sets it in step 408. When the EPD_Start signal is received from the server 200 in step 410, the robot 100 proceeds to step 412 and generates a trigger according to the service-specific voice detection setting value to perform voice detection. The robot 100 generates the trigger according to the service's number of voice detections, voice detection interval length, and non-detection interval length.
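The robot's side of this exchange, storing the setting value on receipt and triggering on EPD_Start, can be sketched as a replay over received messages. The message names "SETTINGS" and "EPD_START", the tuple format, and the fallback to the default value when no setting value has arrived are assumptions for illustration. As in FIG. 4, the master trigger is taken to be enabled throughout.

```python
def run_terminal(messages, default_settings):
    """Replay server messages; return the setting value used for each EPD_Start."""
    current = default_settings
    triggered_with = []
    for kind, payload in messages:
        if kind == "SETTINGS":       # step 408: set the service-specific value
            current = payload
        elif kind == "EPD_START":    # steps 410 and 412: trigger with the current value
            triggered_with.append(current)
    return triggered_with
```

Because the setting value is applied before EPD_Start is processed, the trigger generated in step 412 always reflects the most recently received service-specific value.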

The trigger generation process is now described in more detail. FIG. 5 is a flowchart of a trigger generation process according to an embodiment of the present invention, and FIG. 6 is a diagram showing single and multi-triggers according to an embodiment of the present invention.

Referring first to FIG. 5, when the robot 100 receives a voice detection start signal, it checks in step 502 whether the master trigger is enabled. In step 504, it determines whether the service-specific voice detection setting value set in the setting value setting unit 106 specifies a single trigger or a multi-trigger.

If it is a single trigger, the robot 100 generates a single trigger signal in step 506. Referring to FIG. 6(a), a single trigger signal consists of a voice input enabled interval (A) and a voice input disabled interval (B). Interval A is the interval in which the microphone 102 is open, and interval B is the interval in which the microphone is closed. When a single trigger signal such as that of FIG. 6(a) is generated, the robot 100 opens the microphone 102 only once, either for a fixed time or until the end point of voice detection (EPD: End-Point Detection), and so performs one voice detection.

If it is a multi-trigger, the robot 100 generates a multi-trigger signal in step 508. Referring to FIG. 6(b), a multi-trigger signal may consist of two or more single trigger signals. When a multi-trigger signal such as that of FIG. 6(b) is generated, the robot 100 opens the microphone 102 as many times (n) as there are voice input enabled intervals (A) in the multi-trigger signal, so that voice detection is performed repeatedly.
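The single and multi-trigger signals of FIG. 6 can be sketched as a list of microphone-open windows built from the three setting components. This is a minimal sketch assuming fixed-length intervals measured in seconds; it does not model early closing at end-point detection, and the function and parameter names are illustrative.

```python
def trigger_windows(detection_count, detect_seconds, blocked_seconds, start=0.0):
    """Return (open_time, close_time) pairs for each interval A.

    detection_count == 1 corresponds to a single trigger; two or more
    correspond to a multi-trigger made of repeated A and B intervals.
    """
    windows = []
    t = start
    for _ in range(detection_count):
        windows.append((t, t + detect_seconds))   # interval A: microphone open
        t += detect_seconds + blocked_seconds     # interval B: microphone closed
    return windows


single = trigger_windows(1, 5.0, 1.0)   # one open window
multi = trigger_windows(3, 2.0, 1.0)    # three open windows separated by B
```

With these inputs, `single` contains one window from 0.0 to 5.0, and `multi` contains windows starting at 0.0, 3.0, and 6.0, each 2.0 seconds long.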

Meanwhile, in the robot 100 a next trigger signal may occur while the trigger generated by a previous trigger signal is still in progress. Referring to FIG. 7, which illustrates this overlapping case, if a next trigger signal (Trigger2) occurs while the trigger from the previous trigger signal (Trigger1) is still being generated, the robot 100 ignores the next trigger signal until generation of the previous trigger (single or multi) has ended.
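The overlap rule of FIG. 7 amounts to rejecting any trigger request that arrives while an earlier trigger is still running. The sketch below assumes that time is passed in explicitly as seconds and that the full duration of a trigger is known when it starts; the class and method names are illustrative.

```python
class TriggerGate:
    """Ignore a new trigger while the previous one is still in progress."""

    def __init__(self):
        self.busy_until = 0.0   # time at which the current trigger ends

    def request(self, now, duration):
        """Return True if the trigger is accepted, False if it is ignored."""
        if now < self.busy_until:
            return False                    # Trigger2 arrives during Trigger1: ignored
        self.busy_until = now + duration    # accept and mark the gate busy
        return True
```

In this sketch a request that arrives exactly when the previous trigger ends is accepted, which is one possible reading of "until generation of the previous trigger has ended".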

Although specific embodiments have been described above, various modifications may be made without departing from the scope of the present invention. The scope of the present invention should therefore be determined not by the described embodiments but by the claims and their equivalents.

As described above, the present invention controls the start of voice detection for each service, which makes it possible to start voice detection in a way that is optimized for the service.

In addition, the present invention controls the voice detection time point on both the robot side and the server side, which can improve the performance of the network robot system.

Claims

A voice detection control system of a network terminal, comprising:

a terminal that, when voice signal detection is required, receives and sets a voice detection setting value according to a service, and generates a trigger signal for voice detection according to the voice detection setting value according to the service; and

a server that determines a service of the terminal and transmits the voice detection setting value according to the service to the terminal.

The system of claim 1,

wherein the terminal is a device whose operation is controlled through the server.

The system of claim 1,

wherein the voice detection setting value according to the service includes a number of voice detections according to the service, a voice detection section length, and a voice detection section length value.

The system of claim 3,

wherein the terminal generates a single trigger when the number of voice detections is one.

The system of claim 3,

wherein the terminal generates a multi-trigger when the number of voice detections is two or more.

The system of claim 1, wherein the terminal comprises:

a microphone that is opened or closed according to the trigger signal and receives a user's voice in the open state;

a voice detector that receives the user's voice from the microphone, detects a voice signal, and transmits the detected voice signal to the server;

a setting value setting unit that stores a basic voice detection setting value and, when a voice detection setting value according to a service is received from the server, sets the voice detection setting value to the voice detection setting value according to the service; and

a master trigger that, upon receiving a master trigger enable signal from the server, enables trigger generation according to the voice detection setting value according to the service, and, upon receiving a master trigger disable signal, stops the trigger generation.

The system of claim 6, wherein the server comprises:

a speech recognizer that receives the voice signal detected by the terminal and recognizes the received voice;

a service-specific setting value storage unit that stores a predetermined voice detection setting value for each service; and

a control unit that determines whether the server is to control the voice detection of the terminal and, when the server controls the voice detection of the terminal, determines the service performed by the terminal and transmits to the terminal, from among the setting values stored in the service-specific setting value storage unit, the voice detection setting value according to the corresponding service.

The system of claim 1,

wherein the terminal determines whether voice detection is requested according to whether a user operates a switch.

The system of claim 1,

wherein the terminal determines whether voice detection is requested according to a voice command of a user.

The system of claim 9,

wherein the server transmits a master trigger enable signal requesting the start of voice detection when voice detection by the terminal is required, and transmits a master trigger disable signal when voice detection by the terminal is to be terminated.

The system of claim 10,

wherein the terminal starts voice detection under server control when a voice detection start signal is received from the server in the master trigger enable state.

The system of claim 10,

wherein the terminal terminates voice detection under server control when a master trigger disable signal is received from the server.

The system of claim 10,

wherein the terminal does not perform voice detection when there is a voice detection request from the user while the master trigger is in the disable state set by the server.

The system of claim 1,

wherein the terminal stores a basic voice detection setting value and, when the user requests the start of voice detection, generates a trigger according to the basic voice detection setting value and detects the voice.

The system of claim 1,

wherein, when a next trigger occurs before generation of the previous trigger has finished, the terminal ignores the next trigger and continues the previous trigger generation.

A voice detection control method for a network terminal, comprising:

receiving, by the terminal, a voice detection setting value according to a service from a server; and

detecting a voice by generating a trigger signal for voice detection according to the voice detection setting value according to the service when voice signal detection is required.

The method of claim 16, further comprising:

receiving, by the terminal, a switch operation of a user; and

determining whether the input switch operation signal is a voice detection request signal.

The method of claim 16, further comprising:

receiving, by the robot, a user's voice command; and

determining whether the command input by the user is a voice detection request.

The method of claim 16, further comprising:

determining that voice detection is required when a voice detection signal requesting the start of voice detection is received from the server in the master trigger enable state.

The method of claim 16, further comprising:

stopping the trigger generation for voice detection when the terminal receives, from the server, a master trigger disable signal requesting that voice detection be stopped.

The method of claim 20, further comprising:

not performing, by the terminal, the voice detection when there is a voice detection request from the user while the master trigger is in the disable state set by the server.

The method of claim 16,

wherein, when a next trigger occurs before generation of the previous trigger has finished, the terminal ignores the next trigger and continues the previous trigger generation.