KR101159133B1

KR101159133B1 - Android robot and method for using the android robot

Info

Publication number: KR101159133B1
Application number: KR1020090133126A
Authority: KR
Inventors: 이덕연; 이동욱; 이호길; 박현섭; 손웅희
Original assignee: 한국생산기술연구원
Priority date: 2009-12-29
Filing date: 2009-12-29
Publication date: 2012-06-25
Also published as: KR20110076424A

Abstract

The present invention relates to an android robot comprising: a plurality of microphones; a band-pass filter that extracts the other party's voice signal from each of the signals picked up by the microphones; a codec that converts each extracted voice signal into a digital signal; a sound source processor that tracks the other party's position using the phase and magnitude differences between the digitized voice signals; a host processor that analyzes the converted voice signal, looks up gesture data corresponding to the analyzed voice signal in a gesture DB, and controls a robot driver based on the retrieved gesture data and the tracked position; and the robot driver, which is driven under the control of the host processor. With this invention, the android robot does not merely recognize the other party's voice but also determines where the speaker is, and can therefore respond appropriately to both the voice and the position of the other party.

Android robot, sound source processor, sound source tracking

Description

ANDROID ROBOT AND METHOD FOR USING THE ANDROID ROBOT

The present invention relates to an android robot, and in particular to an android robot that determines the other party's voice and position and responds appropriately to both.

Speech recognition is a very important capability for an android robot. Conventional android robots, however, recognize speech using a single wired microphone. As a result, they have no sense of direction toward the speaker and cannot determine where a sound is coming from.

To solve this problem, android robots that track the direction of a sound source have been developed. In these conventional robots, however, the sound source is extracted on a sound board connected directly to the analog processing section, so the voice signal contains many noise components. Amplifying such a noisy voice signal amplifies the noise as well, which makes the voice signal hard to distinguish, and as a result the tracked direction value of the sound source is not consistent. A technique is therefore needed that minimizes noise so that a more accurate direction value of the sound source can be derived.

In addition, in conventional android robots the path from the microphone input to the host processor is an analog line, which causes severe voice degradation. The host processor also handles all control of the robot, including voice and motion, so when it processes robot motion and voice at the same time, the heavy load degrades the output quality of both. A technique that solves these problems is therefore needed.

The present invention was devised to solve the above problems. An object of the present invention is to provide an android robot that tracks the other party's voice and position and responds to them.

Another object of the present invention is to provide a method of using an android robot that tracks the other party's voice and position and responds to them.

Still another object of the present invention is to provide a computer-readable medium on which is recorded a program for performing the method of using an android robot that tracks the other party's voice and position and responds to them.

To achieve the above object, an android robot according to the present invention comprises: a plurality of microphones; a band-pass filter that extracts the other party's voice signal from each of the signals picked up by the microphones; a codec that converts each extracted voice signal into a digital signal; a sound source processor that tracks the other party's position using the phase and magnitude differences between the digitized voice signals; a host processor that analyzes the voice signal, looks up gesture data corresponding to the voice signal in a gesture DB, and controls a robot driver based on the retrieved gesture data and the tracked position; and the robot driver, which is driven under the control of the host processor. Here, the sound source processor receives the retrieved gesture data from the host processor, looks up robot voice data mapped to the gesture data in a voice DB, and controls the retrieved robot voice data to be output through a speaker. The band-pass filter extracts, from the signals picked up by the microphones, a signal in a predetermined frequency band centered on 1 kHz, with maximum gain at 1 kHz. The sound source processor may transmit the voice signal to the host processor over a controller area network (CAN). At least three microphones are provided to track the three-dimensional position of the other party, and they may be placed at the left and right ends of the head and at the crown. The codec and the sound source processor may be provided inside the head.

To achieve the other object above, a method of using an android robot according to the present invention comprises the steps of: a plurality of microphones picking up sound; a band-pass filter extracting the other party's voice signal from each of the signals picked up by the microphones; a codec converting each extracted voice signal into a digital signal; a sound source processor tracking the other party's position using the phase and magnitude differences between the digitized voice signals; the sound source processor transmitting the converted voice signal and the tracked position information to a host processor; the host processor receiving the voice signal and the position information; the host processor analyzing the received voice signal and looking up gesture data corresponding to the analyzed voice signal in a gesture DB; and the host processor driving the robot using the retrieved gesture data and the received position information, while the sound source processor outputs robot voice data mapped to the retrieved gesture data.

To achieve the still other object above, a computer-readable medium according to the present invention may record a program for executing the steps of: a plurality of microphones picking up sound; a band-pass filter extracting the other party's voice signal from each of the signals picked up by the microphones; a codec converting each extracted voice signal into a digital signal; a sound source processor tracking the other party's direction and position using the phase and magnitude differences between the digitized voice signals; the sound source processor transmitting the converted voice signal and information on the tracked direction and position to a host processor; the host processor receiving the voice signal and the information on the direction and position; the host processor analyzing the received voice signal and looking up gesture data corresponding to the analyzed voice signal in a gesture DB; and the host processor driving the robot using the retrieved gesture data and the received position information, while the sound source processor outputs robot voice data corresponding to the retrieved gesture data.

With the android robot according to the present invention, the robot does not merely recognize the other party's voice but also determines where the speaker is, and can therefore respond appropriately to both the voice and the position of the other party. As a result, the android robot can communicate with humans more naturally.

In addition, with the android robot according to the present invention, the microphones, the codec, and the sound source processor are all built into the robot's head, which minimizes the analog lines and, as a result, minimizes the noise in the analog signals picked up by the microphones.

In addition, with the android robot according to the present invention, the voice signal picked up by each microphone is converted into a digital signal before being transmitted to the sound source processor. This shortens the analog transmission distance and reduces voice degradation compared with the prior art.

In addition, with the android robot according to the present invention, the motions and gestures that respond to the other party's voice are controlled by the host processor, while the robot voice that responds to the other party's voice is controlled separately by the sound source processor. Compared with the prior art, in which the host processor controlled everything, this improves the quality of both the robot's motion and its voice.

Because the present invention can be modified in various ways and can have several embodiments, specific embodiments are illustrated in the drawings and described in detail. This is not intended to limit the present invention to particular embodiments; the invention should be understood to include all modifications, equivalents, and substitutes that fall within its spirit and technical scope. In describing the drawings, the same reference numerals are used for the same components.

Terms such as first, second, A, and B may be used to describe various components, but the components are not limited by these terms. The terms are used only to distinguish one component from another. For example, without departing from the scope of the present invention, a first component may be called a second component, and similarly a second component may be called a first component. The term "and/or" includes any combination of a plurality of related listed items, or any one of them.

The terms used in this application are used only to describe particular embodiments and are not intended to limit the present invention. A singular expression includes the plural unless the context clearly indicates otherwise. In this application, terms such as "comprise" or "have" specify the presence of the features, numbers, steps, operations, components, parts, or combinations thereof described in the specification, and do not preclude the presence or addition of one or more other features, numbers, steps, operations, components, parts, or combinations thereof.

Unless defined otherwise, all terms used herein, including technical and scientific terms, have the same meaning as commonly understood by a person of ordinary skill in the art to which the present invention belongs. Terms such as those defined in commonly used dictionaries should be interpreted as having meanings consistent with their meanings in the context of the related art, and are not to be interpreted in an idealized or excessively formal sense unless expressly so defined in this application.

Preferred embodiments of the present invention are described in detail below with reference to the accompanying drawings.

FIG. 1 is a block diagram showing the configuration of an android robot according to the present invention.

FIG. 2 is a view showing the inside of the head of the android robot according to the present invention.

FIGS. 3 to 5 are views showing the positions of the plurality of microphones installed in the android robot according to the present invention.

Referring to FIG. 1, the android robot according to the present invention includes: a plurality of microphones 101a, 101b, and 101c; a band-pass filter 102 that extracts the other party's voice signal from each of the signals picked up by the microphones 101a, 101b, and 101c; a codec 103 that converts each extracted voice signal into a digital signal; a sound source processor 104 that tracks the other party's position using the phase and magnitude differences between the digitized voice signals; a host processor 105 that analyzes the voice signal, looks up gesture data corresponding to the voice signal in a gesture DB (not shown), and controls the robot driver based on the retrieved gesture data and the tracked position; and a robot driver 106 that is driven under the control of the host processor.

Here, the sound source processor 104 may receive the retrieved gesture data from the host processor 105, look up robot voice data mapped to the gesture data in a robot voice DB (not shown), and control the retrieved robot voice data to be output through the speaker 107.

Here, the band-pass filter 102 extracts, from the signals picked up by the plurality of microphones, a signal in a predetermined frequency band centered on 1 kHz, with maximum gain at 1 kHz.

The sound source processor 104 may also transmit the converted voice signal and the tracked position information to the host processor 105 over a controller area network (CAN).

Here, at least three microphones 101a, 101b, and 101c are provided to track the three-dimensional position of the other party and, as shown in FIG. 5, they may be placed at the left and right ends of the head and at the crown.

In addition, the codec and the sound source processor may be provided inside the head to minimize the analog transmission distance.

Each component of the android robot according to the present invention is now described in detail with reference to FIGS. 1 to 5.

The plurality of microphones 101a, 101b, and 101c collect signals (sound) from a sound source (the other party) in an arbitrary direction. At least three microphones must be provided to track the three-dimensional position of the other party. This is because, when the position of a sound source is estimated by measuring arrival time delays using the geometry of the microphone arrangement, at least three microphones are preferably used in order to track a three-dimensional position, that is, to prevent front-back confusion. When three microphones are provided, they may be placed at the left and right ends of the head and at the crown of the android robot, as shown in FIG. 4. The distance d1 between the microphone 101a at the crown and the microphone 101b at the left end of the head, the distance d2 between the microphone 101a at the crown and the microphone 101c at the right end of the head, and the distance d3 between the microphone 101b at the left end and the microphone 101c at the right end may differ from one another, but the microphones are preferably installed so that the distances d1, d2, and d3 are equal.
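As a rough illustration of the equal-spacing recommendation, the following Python sketch computes the three pairwise distances d1, d2, and d3 from microphone coordinates, together with the largest delay, in samples, that a microphone pair at a given spacing can observe. The coordinates, the 16 kHz sampling rate, and the 343 m/s speed of sound are assumptions made for this example; the patent gives no dimensions.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value


def pairwise_distances(crown, left, right):
    """Return (d1, d2, d3): crown-left, crown-right, and left-right distances in metres."""
    return (math.dist(crown, left), math.dist(crown, right), math.dist(left, right))


def max_lag_samples(distance_m, sample_rate_hz):
    """Largest inter-microphone delay, in whole samples, for a given spacing."""
    return math.ceil(distance_m / SPEED_OF_SOUND * sample_rate_hz)


# Hypothetical layout: side microphones 0.2 m apart, crown microphone placed
# so that the three microphones form an equilateral triangle.
half_width = 0.1
left = (-half_width, 0.0, 0.0)
right = (half_width, 0.0, 0.0)
crown = (0.0, 0.0, half_width * math.sqrt(3))

d1, d2, d3 = pairwise_distances(crown, left, right)
lag_bound = max_lag_samples(d3, 16_000)
```

With equal spacing, the same lag bound applies to every microphone pair, which keeps the delay search identical for all three pairs.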

The band-pass filter 102 extracts, from the signals picked up by the microphones 101a, 101b, and 101c, a signal in a predetermined frequency band centered on 1 kHz, with maximum gain at 1 kHz. Because the signals collected by the microphones 101a, 101b, and 101c contain various kinds of noise as well as the other party's voice, each signal is filtered by the band-pass filter 102 to minimize the noise components. Because 1 kHz is the frequency band closest to the human voice, the band-pass filter 102 extracts a signal in a predetermined band centered on 1 kHz, with maximum gain at 1 kHz and minimal gain at other frequencies, so that the other party's voice is recognized optimally.
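The filter behaviour described here, peak gain at 1 kHz and reduced gain elsewhere, can be illustrated with a minimal Python sketch. In the patent the filter sits before the codec, so it is presumably an analog circuit; the digital second-order (biquad) band-pass section below is only a stand-in chosen for illustration. The 16 kHz sampling rate and the quality factor `q` are assumptions.

```python
import cmath
import math


def bandpass_coefficients(center_hz, sample_rate_hz, q=2.0):
    """Second-order band-pass section with unity (0 dB) gain at center_hz."""
    w0 = 2.0 * math.pi * center_hz / sample_rate_hz
    alpha = math.sin(w0) / (2.0 * q)
    a0 = 1.0 + alpha
    b = (alpha / a0, 0.0, -alpha / a0)
    a = (1.0, -2.0 * math.cos(w0) / a0, (1.0 - alpha) / a0)
    return b, a


def gain_at(b, a, freq_hz, sample_rate_hz):
    """Magnitude of the filter's frequency response at freq_hz."""
    z_inv = cmath.exp(-2j * math.pi * freq_hz / sample_rate_hz)
    num = b[0] + b[1] * z_inv + b[2] * z_inv ** 2
    den = a[0] + a[1] * z_inv + a[2] * z_inv ** 2
    return abs(num / den)


def apply_filter(b, a, samples):
    """Filter a sequence of samples with the direct form I difference equation."""
    out = []
    x1 = x2 = y1 = y2 = 0.0
    for x0 in samples:
        y0 = b[0] * x0 + b[1] * x1 + b[2] * x2 - a[1] * y1 - a[2] * y2
        out.append(y0)
        x2, x1 = x1, x0
        y2, y1 = y1, y0
    return out


# Assumed parameters: 1 kHz center frequency (from the text), 16 kHz sampling rate.
b, a = bandpass_coefficients(1000.0, 16000.0)
```

Calling `gain_at(b, a, f, 16000.0)` for several frequencies shows the gain falling off on both sides of 1 kHz; a larger `q` narrows the pass band.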

The codec 103 converts each of the plurality of voice signals (three or more) extracted by the band-pass filter 102 into a digital signal, and transmits the converted voice signals to the sound source processor 104. Because the codec 103 converts the voice signals picked up by the microphones into digital signals before sending them to the sound source processor, the analog transmission distance is shorter than in the prior art and voice degradation is reduced.

The sound source processor 104 tracks the other party's position using the phase and magnitude differences between the digitized voice signals. Specifically, it transforms the digitized voice signals into the frequency domain and obtains the time delays between the transformed signals. A generalized cross-correlation function is used to find the peak, and hence the delay value, for each of three microphone pairs: the voice signal of the microphone 101a at the crown of the head (microphone 1) and that of the microphone 101b at the left end of the head (microphone 2); the voice signal of microphone 1 and that of the microphone 101c at the right end of the head (microphone 3); and the voice signals of microphone 2 and microphone 3. A positive delay value means that the signal arrived earlier than at the reference microphone, and a negative value means that the signal reached the reference microphone first. Next, using the inter-microphone delay values calculated by the delay time calculation unit 420, several candidate tracking angles are obtained directly by a geometric method, without the conventional empirical interval selection method. Of these candidates, the two tracking angles calculated to be closest to the other party's direction are selected and averaged to give the resulting tracking angle, from which the other party's position is finally estimated.
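The delay-estimation step can be sketched in Python as follows. This is a simplification, not the patent's method: it uses a plain time-domain cross-correlation instead of a generalized cross-correlation in the frequency domain, and it converts a single delay to an angle with the far-field relation sin(theta) = c * tau / d rather than the candidate-angle averaging described above. The sign convention (a positive lag means the sound reached the reference microphone first), the sampling rate, and the microphone spacing are all assumptions of this example.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed


def estimate_delay(ref, other, max_lag):
    """Return the lag (in samples) at which `other` best matches `ref`.

    A positive lag means `other` is a delayed copy of `ref`.
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i, value in enumerate(ref):
            j = i + lag
            if 0 <= j < len(other):
                score += value * other[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def delay_to_angle(lag_samples, sample_rate_hz, mic_distance_m):
    """Far-field arrival angle in degrees, measured from the broadside direction."""
    tau = lag_samples / sample_rate_hz
    ratio = max(-1.0, min(1.0, SPEED_OF_SOUND * tau / mic_distance_m))
    return math.degrees(math.asin(ratio))


# Synthetic test signal: `other` is `ref` delayed by three samples.
ref = [0, 0, 1, 2, 3, 2, 1, 0, 0, 0, 0, 0]
other = [0, 0, 0] + ref[:-3]
lag = estimate_delay(ref, other, max_lag=5)
angle = delay_to_angle(lag, 16_000, 0.2)
```

In a three-microphone arrangement the same function would be applied to each of the three pairs, and the resulting delays combined geometrically.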

In addition, the sound source processor 104 transmits the converted voice signal and the tracked position information to the host processor 105 over a controller area network (CAN). CAN is a serial bus network for microcontrollers that interconnects peripheral devices, such as sensors and actuators, in a real-time control system. CAN does not use the addressing concept found in Ethernet and similar networks; instead, each message carries an identifier that is unique within the network and is broadcast to all nodes simultaneously. Each node decides, based on this identifier, whether to process the message. Bus access is likewise resolved by contention according to message priority. Unlike Ethernet, where transmission is aborted when a collision is detected, this scheme has the advantage that data transmission continues without interruption.
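The identifier-based broadcast and priority-based bus access described here can be modelled with a small Python simulation. This is not a CAN driver: real arbitration is performed bit by bit among frames that contend at the same moment, and sorting by identifier is only a stand-in for the rule that the lowest identifier wins. The identifiers, node names, and payloads are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Frame:
    can_id: int  # message identifier; a lower value has higher priority
    data: bytes  # a classic CAN frame carries at most 8 data bytes


@dataclass
class Node:
    name: str
    accepted_ids: set
    inbox: list = field(default_factory=list)

    def on_frame(self, frame):
        # Every node sees every frame and filters on the identifier alone.
        if frame.can_id in self.accepted_ids:
            self.inbox.append(frame)


def arbitrate(pending):
    """Order contending frames as they would win the bus: lowest identifier first."""
    return sorted(pending, key=lambda frame: frame.can_id)


def broadcast(frames, nodes):
    """Deliver each frame, in arbitration order, to every node on the bus."""
    for frame in arbitrate(frames):
        for node in nodes:
            node.on_frame(frame)


POSITION_ID = 0x100  # hypothetical identifier for tracked-position messages
VOICE_ID = 0x200     # hypothetical identifier for voice-data messages

host = Node("host", {POSITION_ID, VOICE_ID})
motor = Node("motor", {POSITION_ID})
broadcast(
    [Frame(VOICE_ID, b"\x01\x02"), Frame(POSITION_ID, b"\x10\x20\x30")],
    [host, motor],
)
```

The losing frame is not discarded; it is simply sent after the winner, which is the uninterrupted transmission the text refers to.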

The host processor 105 receives the converted voice signal and the tracked position information from the sound source processor 104, analyzes the voice signal, looks up gesture data corresponding to the voice signal in a gesture DB (not shown), and controls the robot driver based on the retrieved gesture data and the tracked position. Specifically, the host processor 105 contains an expected speech DB (not shown), which stores multiple items of speech data that the other party can be expected to utter, and a gesture DB (not shown), which stores gesture data mapped to those items. The host processor 105 analyzes the voice signal received from the sound source processor 104 word by word to extract keywords, searches the expected speech DB for the item of expected speech data that contains the largest number of the keywords found in the voice signal, and then retrieves from the gesture DB the gesture data mapped to that item and outputs it.
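The keyword-overlap lookup can be sketched in Python as follows. The sketch assumes the voice signal has already been converted to text, a step the patent does not detail, and the database contents, keys, and gesture names are invented for illustration.

```python
# Hypothetical expected speech DB: each entry lists keywords of an expected utterance.
EXPECTED_SPEECH_DB = {
    "greeting": {"hello", "hi", "nice", "meet"},
    "farewell": {"goodbye", "bye", "see", "later"},
}

# Hypothetical gesture DB: gesture data mapped to each expected utterance.
GESTURE_DB = {
    "greeting": "wave_hand",
    "farewell": "bow",
}


def extract_keywords(utterance):
    """Split an utterance into lower-case words with punctuation stripped."""
    return {word.strip(".,!?").lower() for word in utterance.split()} - {""}


def lookup_gesture(utterance):
    """Return the gesture mapped to the expected utterance sharing the most keywords."""
    keywords = extract_keywords(utterance)
    best_key, best_overlap = None, 0
    for key, expected in EXPECTED_SPEECH_DB.items():
        overlap = len(keywords & expected)
        if overlap > best_overlap:
            best_key, best_overlap = key, overlap
    return GESTURE_DB.get(best_key)
```

For example, `lookup_gesture("Hello, nice to meet you!")` matches three greeting keywords and returns `"wave_hand"`, while an utterance with no known keywords returns `None`.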

The robot driver 106 is driven under the control of the host processor 105. Specifically, under the control of the host processor 105, the robot driver 106 performs the motion corresponding to the retrieved gesture data and carries out movements such as turning toward the tracked position of the other party.
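Turning toward a tracked position reduces to computing two angles from the position. A minimal Python sketch follows; the coordinate frame (x forward, y to the robot's left, z up, origin at the head) is an assumption, since the patent does not define one.

```python
import math


def head_angles(x, y, z):
    """Yaw and pitch, in degrees, that point the head at position (x, y, z)."""
    yaw = math.degrees(math.atan2(y, x))
    pitch = math.degrees(math.atan2(z, math.hypot(x, y)))
    return yaw, pitch


# A speaker ahead and to the left, at head height.
yaw, pitch = head_angles(1.0, 1.0, 0.0)
```

Using `atan2` rather than `atan` keeps the sign of each coordinate, so positions behind the robot yield yaw angles beyond 90 degrees instead of being folded to the front.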

In addition, so that the android robot according to the present invention can output a robot voice in response to the other party's voice, the sound source processor 104 (specifically, the robot voice processing unit 320 of the sound source processor 104) receives the retrieved gesture data from the host processor 105, looks up robot voice data mapped to the gesture data in a robot voice DB (not shown), and controls the retrieved robot voice data to be output through the speaker. Specifically, the sound source processor 104 transmits the retrieved robot voice data to the codec 103. The codec 103 receives the robot voice data and converts it into an analog signal, and the resulting analog signal is output through the speaker 107. The retrieved gesture data is transmitted from the host processor 105 to the sound source processor 104 over the controller area network (CAN).

As shown in FIG. 2, the codec 103 and the sound source processor 104 are preferably provided inside the head. This minimizes the distance to the microphones 101a, 101b, and 101c and the speaker 107 installed in the robot's head, and thereby minimizes noise.

FIG. 6 is a flowchart showing a method of using the android robot according to the present invention.

Referring to FIG. 6, in the method of using the android robot according to the present invention, the plurality of microphones 101a, 101b, and 101c first collect sound (S601). The microphones 101a, 101b, and 101c are built into the android robot, and sound is preferably collected by at least three microphones so that the three-dimensional position of the counterpart can be tracked. When three microphones are used, sound may be collected through microphones provided at the left and right ends of the head and at the crown.

Next, the band-pass filter 102 extracts the counterpart's voice signal from each of the signals collected through the microphones 101a, 101b, and 101c (S602). Specifically, the band-pass filter 102 extracts, from the signals collected through the microphones, a signal in a predetermined frequency band centered on 1 kHz, with the maximum gain at 1 kHz.
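The filtering in step S602 can be illustrated with a minimal digital sketch. The band-pass filter 102 in this description sits ahead of the codec and is therefore presumably analog; the code below only mimics its stated response (peak gain at 1 kHz) with a second-order band-pass section taken from the widely used RBJ Audio EQ Cookbook formulas. The 16 kHz sample rate, the Q of 1.0, and all function names are assumptions for illustration and do not come from the description.

```python
import cmath
import math


def bandpass_coefficients(center_hz, sample_rate_hz, q):
    """Second-order band-pass (unity peak gain), RBJ Audio EQ Cookbook form."""
    w0 = 2.0 * math.pi * center_hz / sample_rate_hz
    alpha = math.sin(w0) / (2.0 * q)
    a0 = 1.0 + alpha
    # Coefficients normalized so that the leading denominator term is 1.
    b = (alpha / a0, 0.0, -alpha / a0)
    a = (1.0, -2.0 * math.cos(w0) / a0, (1.0 - alpha) / a0)
    return b, a


def gain_at(b, a, freq_hz, sample_rate_hz):
    """Magnitude of the filter's frequency response at freq_hz."""
    z_inv = cmath.exp(-1j * 2.0 * math.pi * freq_hz / sample_rate_hz)
    num = b[0] + b[1] * z_inv + b[2] * z_inv * z_inv
    den = a[0] + a[1] * z_inv + a[2] * z_inv * z_inv
    return abs(num / den)


def apply_filter(b, a, samples):
    """Run the filter over a sequence of samples (direct form I)."""
    out = []
    x1 = x2 = y1 = y2 = 0.0
    for x0 in samples:
        y0 = b[0] * x0 + b[1] * x1 + b[2] * x2 - a[1] * y1 - a[2] * y2
        out.append(y0)
        x2, x1 = x1, x0
        y2, y1 = y1, y0
    return out


# Assumed parameters: 1 kHz center (from the description), 16 kHz rate and Q = 1 (illustrative).
b, a = bandpass_coefficients(1000.0, 16000.0, 1.0)
for f in (200.0, 1000.0, 4000.0):
    print(f, round(gain_at(b, a, f, 16000.0), 3))
```

With these assumed parameters the response peaks at the 1 kHz center and falls off on both sides, which is the behavior the description asks of the filter; the actual bandwidth is left as "predetermined" in the source.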

Next, the codec 103 converts each of the extracted voice signals into a digital signal (S603).

Next, the sound source processor 104 tracks the position of the counterpart using the phase difference and magnitude difference between the voice signals converted into digital signals (S604). Specifically, the sound source processor 104 transforms the digitized voice signals into the frequency domain and calculates the delay times between the frequency-domain voice signals received at the respective microphones. Using the calculated delay times, it obtains several candidate tracking angles directly from a geometric method, without the conventional empirical interval-selection method. From these candidates it selects the two tracking angles calculated to be closest to the counterpart's direction, averages them to obtain the resulting tracking angle, and thereby finally estimates the counterpart's position.
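The delay-to-angle idea in step S604 can be sketched as follows. This is not the processing of the sound source processor 104 itself: the description works in the frequency domain and does not give its geometric formulas, so the sketch substitutes a brute-force time-domain cross-correlation for the delay estimate and the standard far-field relation sin(theta) = c * tau / d for the angle. It also assumes that "the two closest tracking angles" can be approximated by the two candidates that agree best with each other. The function names, the speed of sound of 343 m/s, and the example spacing are all illustrative assumptions.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # assumed value for air at room temperature


def estimate_delay_samples(x, y, max_lag):
    """Return the lag (in samples) by which y trails x, via cross-correlation."""
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for n in range(len(x)):
            m = n + lag
            if 0 <= m < len(y):
                score += x[n] * y[m]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def delay_to_angle_deg(delay_s, spacing_m):
    """Far-field arrival angle for one microphone pair: sin(theta) = c * tau / d."""
    ratio = SPEED_OF_SOUND_M_S * delay_s / spacing_m
    ratio = max(-1.0, min(1.0, ratio))  # guard against values just outside asin's domain
    return math.degrees(math.asin(ratio))


def fuse_candidates(angles_deg):
    """Average the two candidate angles that lie closest to each other.

    Angle wrap-around is ignored here for brevity.
    """
    best_pair = None
    for i in range(len(angles_deg)):
        for j in range(i + 1, len(angles_deg)):
            gap = abs(angles_deg[i] - angles_deg[j])
            if best_pair is None or gap < best_pair[0]:
                best_pair = (gap, angles_deg[i], angles_deg[j])
    return (best_pair[1] + best_pair[2]) / 2.0


# Example: the second microphone hears the same pulse three samples later.
left = [0, 0, 1, 2, 1, 0, 0, 0, 0, 0]
right = [0, 0, 0, 0, 0, 1, 2, 1, 0, 0]
lag = estimate_delay_samples(left, right, max_lag=5)
angle = delay_to_angle_deg(lag / 16000.0, spacing_m=0.15)  # assumed rate and spacing
print(lag, round(angle, 1))
print(fuse_candidates([10.0, 12.0, 40.0]))
```

With three microphones there are three microphone pairs, so each measurement yields several candidate angles; discarding the outlier before averaging is one plausible reading of the selection step described above.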

Next, the sound source processor 104 transmits the converted voice signal and the tracked position information to the host processor 105 (S605). Specifically, the sound source processor 104 transmits the converted voice signal to the host processor 105 over a CAN (controller area network).
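As a rough illustration of step S605, the sketch below packs a position estimate into a single 8-byte payload, the maximum data length of a classic CAN frame, and splits a longer byte stream (such as digitized voice) into 8-byte chunks. The description does not specify any message layout, so the field order, scaling, and function names here are purely hypothetical.

```python
import struct

CAN_MAX_PAYLOAD = 8  # data bytes in one classic CAN frame

# Hypothetical layout: azimuth and elevation in 0.01 degree steps (signed 16-bit),
# distance in millimetres and a sequence counter (unsigned 16-bit), little-endian.
_POSITION_FORMAT = "<hhHH"


def pack_position(azimuth_deg, elevation_deg, distance_m, sequence):
    """Encode a position estimate into one 8-byte payload."""
    return struct.pack(
        _POSITION_FORMAT,
        round(azimuth_deg * 100),
        round(elevation_deg * 100),
        round(distance_m * 1000),
        sequence & 0xFFFF,
    )


def unpack_position(payload):
    """Decode a payload produced by pack_position."""
    az, el, dist, seq = struct.unpack(_POSITION_FORMAT, payload)
    return az / 100.0, el / 100.0, dist / 1000.0, seq


def split_into_frames(data):
    """Split a byte stream into chunks that each fit one CAN frame."""
    return [data[i:i + CAN_MAX_PAYLOAD] for i in range(0, len(data), CAN_MAX_PAYLOAD)]


payload = pack_position(-30.5, 12.25, 1.5, 7)
print(len(payload), unpack_position(payload))
print([len(chunk) for chunk in split_into_frames(bytes(20))])
```

Fixed-point scaling keeps the whole estimate inside one frame; a real system would also need message identifiers and a reassembly scheme for the voice data, neither of which is described in the source.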

Next, the host processor 105 receives the voice signal and the position information (S606).

Next, the host processor 105 analyzes the received voice signal and looks up the gesture data corresponding to the analyzed voice signal in a gesture DB (not shown) (S607). Specifically, the host processor 105 incorporates an expected-voice DB (not shown), which stores a plurality of utterances expected from the counterpart, and a gesture DB (not shown), which stores gesture data mapped to those expected utterances. It analyzes the voice signal received from the sound source processor 104 word by word to extract keywords, searches the expected-voice DB (not shown) for the expected utterance that contains the largest number of the keywords found in the voice signal, and then retrieves and outputs the gesture data mapped to that utterance from the gesture DB (not shown).
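The keyword-overlap lookup of step S607 can be sketched in a few lines. The sketch assumes the voice signal has already been transcribed into text, which the description does not detail, and the database contents, identifiers, and function names are hypothetical examples rather than anything stated in the source.

```python
# Hypothetical expected-voice DB: utterance id -> keywords of the expected utterance.
EXPECTED_VOICE_DB = {
    "greeting": {"hello", "hi", "nice", "meet"},
    "farewell": {"goodbye", "bye", "see", "later"},
}

# Hypothetical gesture DB: utterance id -> gesture mapped to that utterance.
GESTURE_DB = {
    "greeting": "wave_hand",
    "farewell": "bow",
}


def extract_keywords(text):
    """Split the transcribed utterance into lowercase words without punctuation."""
    words = {word.strip(".,!?").lower() for word in text.split()}
    words.discard("")
    return words


def lookup_gesture(text):
    """Return the gesture mapped to the expected utterance sharing the most keywords."""
    keywords = extract_keywords(text)
    best_id, best_overlap = None, 0
    for utterance_id, expected in EXPECTED_VOICE_DB.items():
        overlap = len(keywords & expected)
        if overlap > best_overlap:
            best_id, best_overlap = utterance_id, overlap
    if best_id is None:
        return None  # no expected utterance shares any keyword
    return GESTURE_DB.get(best_id)


print(lookup_gesture("Hello, nice to meet you!"))
print(lookup_gesture("Goodbye, see you later."))
```

Returning `None` when nothing matches is a design choice of this sketch; the description does not say how the robot should behave for an unrecognized utterance.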

Next, the host processor 105 drives the robot using the retrieved gesture data and the received position information, and the sound source processor 104 outputs the robot voice data corresponding to the retrieved gesture data (S608).

In detail, so that the android robot makes a gesture corresponding to the counterpart's voice and moves appropriately for the counterpart's position, step S608 includes a step in which the host processor controls the robot driver using the retrieved gesture data and the received position information, and a step in which the robot driver is driven under the control of the host processor.

In addition, so that the android robot outputs a voice corresponding to the counterpart's voice, step S608 includes a step in which the host processor transmits the retrieved gesture data to the sound source processor, a step in which the sound source processor receives the gesture data and looks up the robot voice data mapped to the received gesture data in a voice DB (not shown), and a step in which the sound source processor outputs the retrieved robot voice data through a speaker. Specifically, the host processor transmits the retrieved gesture data to the sound source processor over a CAN (controller area network). The sound source processor 104 receives the retrieved gesture data from the host processor 105, looks up the robot voice data mapped to the gesture data in the robot voice DB (not shown), and transmits the retrieved robot voice data to the codec 103. The codec 103 receives the robot voice data and converts it into an analog signal, which is output through the speaker 107.
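The division of work in step S608, with motion handled by the host processor and voice handled by the sound source processor, can be summarized in a small sketch. The data structures, file names, and function names below are hypothetical placeholders; the CAN transfer and the codec conversion are reduced to plain function calls.

```python
# Hypothetical robot voice DB held by the sound source processor: gesture id -> voice clip.
ROBOT_VOICE_DB = {
    "wave_hand": "hello.pcm",
    "bow": "goodbye.pcm",
}


def host_processor_step(gesture_id, azimuth_deg):
    """Host side: build a command for the robot driver and a message for the sound source processor."""
    drive_command = {"gesture": gesture_id, "turn_to_deg": azimuth_deg}
    can_message = gesture_id  # stands in for the gesture data sent over CAN
    return drive_command, can_message


def sound_processor_step(can_message):
    """Sound source processor side: look up the robot voice data mapped to the gesture."""
    return ROBOT_VOICE_DB.get(can_message)  # would then be passed to the codec and speaker


drive, message = host_processor_step("wave_hand", -30.0)
print(drive)
print(sound_processor_step(message))
```

Only the gesture identifier crosses between the two processors in this sketch, which mirrors the description's point that voice output is resolved on the sound source processor rather than on the host.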

A computer-readable medium recording a program that carries out the method of using the android robot according to the present invention records a program for executing: a step (S601) in which the plurality of microphones 101a, 101b, and 101c collect sound; a step (S602) in which the band-pass filter 102 extracts the counterpart's voice signal from each of the signals collected through the microphones 101a, 101b, and 101c; a step (S603) in which the codec 103 converts each of the extracted voice signals into a digital signal; a step (S604) in which the sound source processor 104 tracks the position of the counterpart using the phase difference and magnitude difference between the voice signals converted into digital signals; a step (S605) in which the sound source processor 104 transmits the converted voice signal and the tracked position information to the host processor 105; a step (S606) in which the host processor 105 receives the voice signal and the position information; a step (S607) in which the host processor 105 analyzes the meaning of the received voice signal and looks up the gesture data corresponding to the analyzed voice signal in a gesture DB (not shown); and a step (S608) in which the host processor 105 drives the robot using the retrieved gesture data and the received position information and the sound source processor 104 outputs the robot voice data corresponding to the retrieved gesture data.

Here, in the step (S601) in which the plurality of microphones 101a, 101b, and 101c collect sound, sound is collected by at least three microphones so that the three-dimensional position of the counterpart can be tracked. When three microphones are used, sound may be collected through microphones provided at the left and right ends of the head and at the crown.

Here, in the step in which the band-pass filter 102 extracts the counterpart's voice signal from each of the signals collected through the microphones 101a, 101b, and 101c, the band-pass filter 102 extracts, from the signals collected through the microphones, a signal in a predetermined frequency band centered on 1 kHz, with the maximum gain at 1 kHz.

Here, in the step in which the sound source processor 104 transmits the converted voice signal and the tracked position information to the host processor 105, the sound source processor 104 transmits the converted voice signal to the host processor 105 over a CAN (controller area network).

Here, so that the android robot makes a gesture corresponding to the counterpart's voice and moves appropriately for the counterpart's position, the step (S608) in which the host processor 105 drives the robot using the retrieved gesture data and the received position information and the sound source processor 104 outputs the robot voice data corresponding to the retrieved gesture data includes a step in which the host processor 105 controls the robot driver 106 using the retrieved gesture data and the received position information, and a step in which the robot driver 106 is driven under the control of the host processor 105.

In addition, so that the android robot outputs a voice corresponding to the counterpart's voice, the step (S608) in which the host processor 105 drives the robot using the retrieved gesture data and the received position information and the sound source processor 104 outputs the robot voice data corresponding to the retrieved gesture data includes a step in which the host processor 105 transmits the retrieved gesture data to the sound source processor 104, a step in which the sound source processor 104 receives the gesture data and looks up the robot voice data corresponding to the received gesture data in a voice DB (not shown), and a step in which the sound source processor 104 outputs the retrieved robot voice data through the speaker 107.

As described above, according to the present invention, the robot does not merely recognize the counterpart's voice but also determines the position of the speaker, so that it can respond appropriately to both the counterpart's voice and position. As a result, the android robot can communicate with humans more naturally.

In addition, as described above, in the android robot according to the present invention the microphones, the codec, and the sound source processor are all built into the robot's head, which minimizes the analog lines and, as a result, minimizes noise in the analog signals collected by the microphones.

In addition, as described above, according to the present invention the voice signal collected by the microphones is converted into a digital signal before being transmitted to the sound source processor, which shortens the analog transmission distance and reduces voice degradation compared with the prior art.

In addition, as described above, according to the present invention the motions and gestures corresponding to the counterpart's voice are controlled by the host processor, while the robot voice corresponding to the counterpart's voice is controlled separately by the sound source processor. Compared with the prior art, in which the host processor controlled both, this improves the quality of the robot's motion and voice.

FIGS. 3 to 5 show the positions of the plurality of microphones installed in the android robot according to the present invention: FIG. 3 is a front view of the android robot, FIG. 4 is a right side view of the android robot, and FIG. 5 is a top view of the android robot.

FIG. 6 is a flowchart showing a method of using the android robot according to the present invention.

[Description of Reference Numerals]

101a, 101b, 101c: microphones
102: band-pass filter
103: codec
104: sound source processor
105: host processor
106: robot driver
107: speaker

Claims

An android robot comprising:

three microphones arranged to have directivity with respect to a sound source;

a band-pass filter that extracts a voice signal of a counterpart from each of the signals collected through the microphones;

a codec that converts each of the extracted voice signals into a digital signal;

a sound source processor that tracks the position of the counterpart using a phase difference and a magnitude difference between the voice signals converted into digital signals;

a host processor that analyzes the converted voice signal, looks up gesture data corresponding to the analyzed voice signal in a gesture DB, and controls a robot driver based on the retrieved gesture data and the tracked position; and

the robot driver, driven under the control of the host processor,

wherein the band-pass filter extracts a signal in a predetermined frequency band centered on 1 kHz from the signals collected through the three microphones and has a maximum gain at 1 kHz.

The android robot of claim 1,

wherein the sound source processor receives the retrieved gesture data from the host processor, looks up robot voice data mapped to the gesture data in a robot voice DB, and controls the retrieved robot voice data to be output through a speaker.

(Deleted)

The android robot of claim 1,

wherein the sound source processor transmits the converted voice signal and the tracked position information to the host processor over a CAN (controller area network).

(Deleted)

The android robot of claim 1,

wherein the microphones are provided at the left and right ends of the head and at the crown.

The android robot of claim 1,

wherein the codec and the sound source processor are provided inside the head.

A method of using an android robot, comprising:

collecting sound with three microphones arranged to have directivity with respect to a sound source;

extracting, by a band-pass filter, a voice signal of a counterpart from each of the signals collected through the microphones, wherein the band-pass filter extracts a signal in a predetermined frequency band centered on 1 kHz from the signals collected through the three microphones and has a maximum gain at 1 kHz;

converting, by a codec, each of the extracted voice signals into a digital signal;

tracking, by a sound source processor, the position of the counterpart using a phase difference and a magnitude difference between the voice signals converted into digital signals;

transmitting, by the sound source processor, the converted voice signal and the tracked position information to a host processor;

receiving, by the host processor, the voice signal and the position information;

analyzing, by the host processor, the received voice signal and looking up gesture data corresponding to the analyzed voice signal in a gesture DB; and

driving, by the host processor, the robot using the retrieved gesture data and the received position information, and outputting, by the sound source processor, robot voice data corresponding to the retrieved gesture data.

A computer-readable medium recording a program for using an android robot,

the program executing a step in which the host processor drives the robot using the retrieved gesture data and the received position information and the sound source processor outputs robot voice data corresponding to the retrieved gesture data.