KR100943224B1

KR100943224B1 - An intelligent robot for localizing sound source by frequency-domain characteristics and method thereof

Info

Publication number: KR100943224B1
Application number: KR1020070104131A
Authority: KR
Inventors: 곽근창; 배경숙; 이재연; 김혜진; 박범철; 윤호섭
Original assignee: 한국전자통신연구원
Priority date: 2007-10-16
Filing date: 2007-10-16
Publication date: 2010-02-18
Also published as: KR20090038697A

Abstract

The present invention relates to a method for obtaining the delay times between microphones, and from them a sound source tracking angle, using frequency-domain characteristics in an intelligent robot equipped with a multi-channel sound board and microphones. The intelligent robot capable of sound source tracking includes: a voice data acquisition unit that acquires voice through the multi-channel sound board and microphones; a delay time processing unit that obtains delay time values from the voice acquired by each microphone using a frequency-domain sound source tracking method based on GCC (Generalized Cross-Correlation)-PHAT (Phase Transform); a tracking angle processing unit that calculates several candidate tracking angles from the inter-microphone delay times and estimates a reliable tracking angle from them; and a robot driving unit that rotates the robot toward the caller according to the tracking angle. As performance measures of the sound source tracking device, the invention uses the tracking success rate within the camera's field of view (FOV) and the average tracking error, and it shows high sound source tracking performance in noisy and reverberant environments at both short and long range (within 5 m).

Intelligent Robot, Sound Source Tracking, GCC-PHAT, FOV, Tracking Success Rate

Description

Intelligent robot capable of sound source tracking and method thereof {AN INTELLIGENT ROBOT FOR LOCALIZING SOUND SOURCE BY FREQUENCY-DOMAIN CHARACTERISTICS AND METHOD THEREOF}

The present invention relates to a method for tracking the direction of a sound source generated at an arbitrary position around an intelligent robot. In particular, it relates to an intelligent robot capable of sound source tracking, and a sound source tracking method using the same, in which an intelligent robot equipped with a multi-channel sound board and microphones uses frequency-domain characteristics to obtain the delay times between microphones and, from them, a tracking angle, so that the robot can accurately recognize the direction (angle) from which a caller called it.

The present invention is derived from research conducted as part of the IT New Growth Engine Core Technology Development Project of the Ministry of Information and Communication [Project management number: 2005-S-033-03, Project title: Development and standardization of embedded component technology for URC].

To date, sound source tracking in robot environments has used anywhere from two to many microphones; the two-microphone configuration in particular is widely used in humanoid robots. With two microphones, however, sound sources cannot be tracked in every direction around the robot, only between 0 and 180 degrees. That is, the robot cannot distinguish front from back, and by the nature of the configuration the estimates contain many errors near 0 and 180 degrees.

Meanwhile, much research is also under way on three-dimensional sound source tracking that considers elevation as well as the two-dimensional direction. Intelligent service robots, however, generally use two-dimensional rather than three-dimensional information. Hand claps, the robot's name, and the like are commonly used as sound sources, and speech such as calling the robot's name is used more often because it is the more human-friendly sound source.

Tracking sound sources in all directions with an intelligent robot requires at least three microphones, and in that case a method is needed for obtaining a reliable tracking angle from the inter-microphone delay times. In intensity-based sound source tracking, the tracking angle is calculated from the delay time obtained from the two strongest signals, but this has the drawback that it is difficult to match the gains of the sound board and microphones accurately. Another method empirically selects a sector between microphones from the sign (positive or negative) of the inter-microphone delay values and then obtains the tracking angle from the selected sector; because it too relies on an empirical rule, it is difficult to obtain a reliable tracking angle.

Accordingly, the present invention has been devised to solve the problems that arise when a conventional intelligent robot tracks the direction of a sound source. A first object of the invention is to estimate a reliable tracking angle directly from the inter-microphone delay values, without any empirical sector selection. Several candidate tracking angles can be obtained from the inter-microphone delay values; the tracking angle is estimated by finding the two most similar angles among them and taking their average. A second object of the invention is to provide sound source tracking based on generalized cross-correlation (GCC) in the frequency domain, which is suited to robot environments, in place of the time-domain delay estimation methods that are generally used. More specifically, the GCC-PHAT (Phase Transform) method, which is robust to noise and reverberation, is used.

The present invention is an intelligent robot capable of sound source tracking, comprising: a sound source tracking unit that, when a voice signal is generated from an arbitrary direction within a given area where the robot is located, calculates the time delay values of the voice signal between multi-channel microphones and tracks the direction angle of the sound source that produced the voice signal; a robot driving unit that rotates the robot to the angle tracked by the sound source tracking unit; and a robot control unit that applies the direction angle of the sound source calculated by the sound source tracking unit to the robot driving unit and controls the robot to rotate toward the sound source.

Here, the sound source tracking unit comprises: a voice data acquisition unit that acquires the voice signal generated by the sound source through the multi-channel microphones and sound board; a delay time calculation unit that calculates the delay time values of the voice signal between the multi-channel microphones based on cross-correlation of the acquired voice signals in the frequency domain; and a tracking angle processing unit that estimates the direction angle of the sound source by a geometric method using the inter-microphone voice signal delay time values.

In addition, the delay time calculation unit performs a Fourier transform on each voice signal obtained from the multi-channel microphones to obtain a weighting function value, uses it to calculate the cross-correlation values between the voice signals, and thereby calculates the delay times between the voice signals.

The present invention also provides a method for tracking a sound source in an intelligent robot, comprising: (a) receiving, through multi-channel microphones and a sound board, a voice signal generated by a sound source in an arbitrary direction; (b) calculating the delay time of each voice signal based on cross-correlation in the frequency domain of the voice signals received by the multi-channel microphones; (c) calculating the direction angle of the sound source using the delay time information between the voice signals; and (d) rotating the robot to face the direction angle of the sound source.

Step (b) comprises: (b1) performing a Fourier transform on each voice signal obtained from the multi-channel microphones to obtain a frequency weighting function value; (b2) calculating the cross-correlation values between the voice signals using the weighting function values; and (b3) finding the peak point of the calculated correlation values to obtain the delay times between the voice signals.

By implementing sound source tracking in an intelligent robot with three microphones and a sound board so that sources can be tracked in every direction around the robot, the present invention achieves a high tracking success rate. Using the GCC-PHAT method in the frequency domain makes the tracking robust in robot environments, and estimating a reliable tracking angle directly, without the conventional empirical sector selection, yields high tracking angle accuracy. Because the robot can be called from long range (within 5 m) as well as short range, the user can interact with the robot naturally.

The operating principle of the present invention is described in detail below with reference to the accompanying drawings. In the following description, detailed explanations of well-known functions or configurations are omitted where they would unnecessarily obscure the subject matter of the invention. The terms used below are defined in consideration of their functions in the invention and may vary according to the intentions or customs of users and operators; their definitions should therefore be based on the content of this specification as a whole.

The core technical idea of the invention is to implement sound source tracking in an intelligent robot with three microphones and a sound board so that sources can be tracked in every direction around the robot, to use the GCC-PHAT method in the frequency domain so that the tracking is robust in robot environments, and to estimate a reliable tracking angle directly without the conventional empirical sector selection. With this technique the aims of the invention are readily achieved.

FIG. 1 shows the block configuration of an intelligent robot capable of sound source tracking according to an embodiment of the present invention. The intelligent robot 100 includes a sound source tracking unit 102 that tracks a sound source generated from an arbitrary direction within a given range around the robot, and a robot driving unit 106 that, under the control of a robot control unit 104, rotates the robot toward the sound source direction tracked by the sound source tracking unit 102.

The operation of each component used for sound source tracking in the intelligent robot is described in detail below with reference to FIG. 1.

First, the sound source tracking unit 102 includes a voice data acquisition unit 108, a delay time calculation unit 110, and a tracking angle processing unit 112. The voice data acquisition unit 108 acquires the voice of each channel through the multi-channel (3-channel) microphones and sound board. The delay time calculation unit 110 calculates the delay time values of the voice signals between microphones from the voice signals acquired by the voice data acquisition unit 108, using the GCC-PHAT based sound source tracking method in the frequency domain. The tracking angle processing unit 112 calculates, from the inter-microphone delay times calculated by the delay time calculation unit 110, several candidate tracking angles for the expected direction of the sound source, and estimates the most reliable tracking angle from them.

The robot control unit 104 takes the resulting tracking angle estimated by the tracking angle processing unit 112 to be the direction in which the caller who called the robot produced the sound, and controls the robot driving unit 106 so that the robot rotates toward the sound source and thus responds to the caller's call. The robot driving unit 106 is driven under the control of the robot control unit 104 and rotates the robot 100 about the central axis of robot wheel rotation by the resulting tracking angle estimated by the tracking angle processing unit 112, so that the robot faces the direction in which the caller is located.

FIG. 2 shows the operation control flow for tracking a sound source in an intelligent robot according to an embodiment of the present invention. The embodiment is described in detail below with reference to FIGS. 1 and 2.

First, when a voice signal is generated from an arbitrary direction within a given range around the intelligent robot, for example because a caller calls the robot 100, the voice data acquisition unit 108 in the sound source tracking unit 102 acquires the signal of each channel from the sound board and microphones (S200) and then detects the voice by applying an endpoint detection algorithm to each channel's signal to find its start and end points (S202). Because the robot is generally called by voice, only signals that last at least 0.5 seconds are accepted as voice, and shorter signals are treated as noise; in this way the caller's voice signal is detected.
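The duration rule in step S202 can be illustrated with a minimal sketch. The patent does not specify the endpoint detection algorithm, so the short-time energy threshold used here, the function name `detect_voice_segments`, and the parameters `frame_ms` and `energy_ratio` are assumptions for illustration; only the 0.5 second minimum duration comes from the text.

```python
import numpy as np


def detect_voice_segments(signal, sample_rate, frame_ms=20,
                          energy_ratio=0.1, min_duration_s=0.5):
    """Return (start, end) sample indices of segments accepted as voice.

    A frame counts as active when its mean energy exceeds energy_ratio times
    the largest frame energy (an assumed, simple endpoint detector). A run of
    active frames is accepted only if it lasts at least min_duration_s;
    shorter runs are treated as noise.
    """
    signal = np.asarray(signal, dtype=float)
    frame_len = int(sample_rate * frame_ms / 1000)
    n_frames = len(signal) // frame_len
    if n_frames == 0:
        return []
    frames = signal[: n_frames * frame_len].reshape(n_frames, frame_len)
    energy = np.mean(frames ** 2, axis=1)
    peak = energy.max()
    if peak == 0:
        return []
    active = energy > energy_ratio * peak

    # Collect runs of consecutive active frames as (start_frame, end_frame).
    runs = []
    start = None
    for i, is_active in enumerate(active):
        if is_active and start is None:
            start = i
        elif not is_active and start is not None:
            runs.append((start, i))
            start = None
    if start is not None:
        runs.append((start, n_frames))

    # Keep only runs that are long enough to be a spoken call.
    min_frames = int(np.ceil(min_duration_s * 1000 / frame_ms))
    return [(s * frame_len, e * frame_len)
            for s, e in runs if (e - s) >= min_frames]
```

With a 16 kHz recording that contains a 0.1 second burst followed by a 0.6 second utterance, only the second segment would be returned, which mirrors the rule that short sounds are discarded as noise.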

When the voice data acquisition unit 108 detects voice in this way, the delay time calculation unit 110 performs a Fourier transform on each of the voice signals obtained from the three microphones of the voice data acquisition unit 108 (S204), and then finds the peak point of the generalized cross-correlation (GCC) between microphones 1 and 2, microphones 2 and 3, and microphones 1 and 3 to obtain the delay time values (S206). A positive delay value means that the signal reached the other microphone of the pair before the reference microphone, and a negative value means that it reached the reference microphone first.

Next, the tracking angle processing unit 112 uses the inter-microphone delay values calculated by the delay time calculation unit 110 to obtain several candidate tracking angles directly by a geometric method, without the conventional empirical sector selection (S208). It then finds the two candidate tracking angles judged closest to the direction of the sound source and takes their average to estimate the resulting tracking angle (S210).

Once the tracking angle processing unit 112 has tracked the direction of the sound source, the robot driving unit 106, under the control of the robot control unit 104, sets the rotation angle about the central axis of robot wheel rotation to the resulting tracking angle estimated by the tracking angle processing unit 112 (S212), and rotates the robot about that axis by the set angle so that the robot 100 faces the direction in which the caller is located (S214).

The methods of FIG. 2 for calculating the delay times between voice signals, and the tracking angle from those delay times, are examined in more detail below using the corresponding equations.

[Equation 1] below gives the generalized cross-correlation $R_{x_1x_2}(n)$ between the voice signals $x_1(n)$ and $x_2(n)$ obtained from two microphones. $W(\omega)$ is a frequency weighting function equal to the reciprocal of $|X_1(\omega)X_2^*(\omega)|$; this weighting function is called PHAT (Phase Transform).

[Equation 1]

$$R_{x_1x_2}(n) = \frac{1}{2\pi}\int_{-\infty}^{\infty} W(\omega)\,X_1(\omega)\,X_2^*(\omega)\,e^{j\omega n}\,d\omega$$

Here,

$X_1(\omega)$: frequency-domain value of the voice signal $x_1(n)$,

$X_2^*(\omega)$: complex conjugate of the frequency-domain value of the voice signal $x_2(n)$.

PHAT is a frequency-dependent weighting function that determines the relative importance of each frequency in estimating the time delay, and is expressed as [Equation 2] below.

[Equation 2]

$$W(\omega) = \frac{1}{\left|X_1(\omega)\,X_2^*(\omega)\right|}$$

The delay time (τ) is then obtained as the lag at which the cross-correlation has its peak, as in [Equation 3] below.

[Equation 3]

$$\tau = \arg\max_{n} R_{x_1x_2}(n)$$
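The delay estimation described above (Fourier transform of both channels, PHAT weighting, inverse transform, peak search) can be sketched with NumPy as follows. This is a minimal illustration, not the patent's implementation: the function name `gcc_phat_delay`, the zero-padding length, the small constant `eps`, and the optional `max_delay_s` limit are assumptions. In this sketch a positive result means that the first signal is delayed relative to the second; the sign convention of the actual system may differ.

```python
import numpy as np


def gcc_phat_delay(x1, x2, sample_rate, max_delay_s=None, eps=1e-12):
    """Estimate the delay (in seconds) of x1 relative to x2 with GCC-PHAT."""
    x1 = np.asarray(x1, dtype=float)
    x2 = np.asarray(x2, dtype=float)

    # Zero-pad so that circular correlation does not wrap around.
    n_fft = len(x1) + len(x2)
    X1 = np.fft.rfft(x1, n=n_fft)
    X2 = np.fft.rfft(x2, n=n_fft)

    # Cross-spectrum weighted by the reciprocal of its magnitude (PHAT).
    cross = X1 * np.conj(X2)
    cross /= np.abs(cross) + eps  # eps avoids division by zero (assumption)

    # Back to the lag domain: generalized cross-correlation.
    r = np.fft.irfft(cross, n=n_fft)

    # Restrict the search to physically plausible lags if a limit is given.
    max_lag = n_fft // 2
    if max_delay_s is not None:
        max_lag = min(max_lag, int(round(max_delay_s * sample_rate)))
    max_lag = max(1, max_lag)

    # Reorder so that index 0 corresponds to lag -max_lag.
    r = np.concatenate((r[-max_lag:], r[: max_lag + 1]))
    lag = int(np.argmax(r)) - max_lag
    return lag / sample_rate
```

For three microphones the function would be called once per pair (1 and 2, 2 and 3, 1 and 3). Setting `max_delay_s` to the microphone spacing divided by the speed of sound keeps the peak search within lags that the geometry allows.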

Once the delay time has been obtained by the GCC-PHAT method as above, it is used to obtain the tracking angle for the direction of the sound source. The tracking angle is obtained by first calculating preliminary angles from the delay time values given by the GCC-PHAT method.

That is, assuming the sound source is positioned relative to the robot's three microphones as shown in FIG. 3, the preliminary angles of the sound source are calculated from the delay time values (τ) as in [Equation 4] below.

The quantities used in [Equation 4] are the distance between the microphones, the speed of sound, and the delay time between channel 1 and channel 2.
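How a pairwise delay maps to an angle can be illustrated with the standard far-field approximation, in which the angle θ between the line joining two microphones and the direction of the source satisfies cos θ = c·τ / d. This relation is an assumption for illustration and may differ from the exact form of [Equation 4]; the function name, the default speed of sound of 343 m/s, and the clamping step are likewise hypothetical.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s at about 20 degrees Celsius (assumed value)


def delay_to_angle_deg(tau_s, mic_distance_m, speed_of_sound=SPEED_OF_SOUND):
    """Angle in degrees between the microphone axis and the source direction.

    Uses the far-field relation cos(theta) = c * tau / d (an assumption).
    The ratio is clamped to [-1, 1] because measurement noise can push it
    slightly outside the valid range of acos.
    """
    ratio = speed_of_sound * tau_s / mic_distance_m
    ratio = max(-1.0, min(1.0, ratio))
    return math.degrees(math.acos(ratio))
```

A single pair can only return values between 0 and 180 degrees, so it cannot tell on which side of the microphone axis the source lies. This is the ambiguity that the three-microphone arrangement and its candidate angles are meant to resolve.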

The six candidate tracking angles are calculated from the preliminary angles obtained above as in [Equation 5] below.

Once the six candidate tracking angles for the sound source direction have been obtained, the final tracking angle is estimated by taking the two of these six angles that agree most closely (show the smallest error) and calculating their average. In this way a reliable tracking angle is obtained even if some of the angle values are wrongly tracked.
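The selection step can be sketched as follows, assuming the candidate angles are already available in degrees. The patent only states that the two closest candidates are averaged; measuring closeness on the circle and using a circular mean, so that a pair such as 358 and 2 degrees averages to about 0 rather than 180, is an assumption of this sketch, as are all function names.

```python
import math
from itertools import combinations


def circular_diff_deg(a, b):
    """Smallest absolute difference between two angles, in degrees."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)


def circular_mean_deg(a, b):
    """Mean direction of two angles, returned in the range [0, 360)."""
    x = math.cos(math.radians(a)) + math.cos(math.radians(b))
    y = math.sin(math.radians(a)) + math.sin(math.radians(b))
    return math.degrees(math.atan2(y, x)) % 360.0


def estimate_tracking_angle(candidates_deg):
    """Average the two candidate angles that agree most closely."""
    if len(candidates_deg) < 2:
        raise ValueError("at least two candidate angles are required")
    a, b = min(combinations(candidates_deg, 2),
               key=lambda pair: circular_diff_deg(*pair))
    return circular_mean_deg(a, b)
```

For example, with the hypothetical candidates 62, 58, 121, 239, 301 and 178 degrees, the pair 62 and 58 is selected and the estimate is about 60 degrees; the four outlying candidates have no influence on the result.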

Although specific embodiments have been described above, various modifications may be made without departing from the scope of the present invention. The scope of the invention should therefore be determined by the claims rather than by the described embodiments.

FIG. 1 is a block diagram of an intelligent robot capable of sound source tracking according to an embodiment of the present invention;

FIG. 2 is an operation control flowchart for tracking a sound source in an intelligent robot according to an embodiment of the present invention;

FIG. 3 is a conceptual diagram of calculating the tracking angle of a sound source in an intelligent robot having three microphones according to an embodiment of the present invention.

<Brief description of the main reference numerals in the drawings>

108 : voice data acquisition unit 110 : delay time calculation unit

112 : tracking angle processing unit 106 : robot driving unit

Claims

A method for tracking a sound source in an intelligent robot, comprising:

(a) receiving, through multi-channel microphones and a sound board, a voice signal generated by a sound source in an arbitrary direction;

(b) calculating the delay time of each voice signal based on cross-correlation in the frequency domain of the voice signals received by the multi-channel microphones;

(c) calculating the direction angle of the sound source using the delay time information between the voice signals; and

(d) rotating the robot to face the direction angle of the sound source,

wherein step (b) comprises:

(b1) performing a Fourier transform on each voice signal obtained from the multi-channel microphones to obtain a frequency weighting function value;

(b2) calculating the cross-correlation values between the voice signals using the weighting function values; and

(b3) finding the peak point of the calculated correlation values to calculate the delay times between the voice signals.

The method of claim 1,

wherein step (a) comprises:

(a1) acquiring an audio signal through the multi-channel microphones and sound board; and

(a2) detecting the caller's voice signal in the acquired audio signal.

The method of claim 2,

wherein, in step (a2), only an audio signal that lasts at least a predetermined reference time is detected as the caller's voice signal.

(Deleted)

The method of claim 1,

wherein the weighting function value ($W(\omega)$) is calculated from the two voice signals ($x_1(n)$, $x_2(n)$) obtained from the microphones as in the equation below:

[Equation]

$$W(\omega) = \frac{1}{\left|X_1(\omega)\,X_2^*(\omega)\right|}$$

$X_1(\omega)$: frequency-domain value of the voice signal ($x_1(n)$)

$X_2^*(\omega)$: complex conjugate of the frequency-domain value of the voice signal ($x_2(n)$)

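The method of claim 5,

wherein the cross-correlation value ($R_{x_1x_2}(n)$) between the two voice signals is calculated as in the equation below:

[Equation]

$$R_{x_1x_2}(n) = \frac{1}{2\pi}\int_{-\infty}^{\infty} W(\omega)\,X_1(\omega)\,X_2^*(\omega)\,e^{j\omega n}\,d\omega$$

$W(\omega)$: weighting function

$X_1(\omega)$: frequency-domain value of the voice signal ($x_1(n)$)

$X_2^*(\omega)$: complex conjugate of the frequency-domain value of the voice signal ($x_2(n)$)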

The method of claim 6,

wherein the delay time (τ) between the two voice signals is calculated as in the equation below:

[Equation]

$$\tau = \arg\max_{n} R_{x_1x_2}(n)$$

$R_{x_1x_2}(n)$: cross-correlation between the two voice signals ($x_1(n)$, $x_2(n)$)

The method of claim 1,

wherein step (c) comprises:

(c1) calculating, from the delay times of the voice signals, a plurality of tracking angles estimated as the direction of the sound source;

(c2) selecting the two closest tracking angles from the plurality of tracking angles; and

(c3) calculating the average of the two tracking angles to obtain the angle of the sound source.

The method of claim 8,

wherein, in step (c1), when there are three microphones, the preliminary angles (α, β, γ) formed between the three straight lines connecting the microphones and the direction of the sound source are calculated using the delay time (τ), and the plurality of tracking angles that the sound source makes with the robot are then calculated from the preliminary angles.

The method of claim 9,

wherein the preliminary angles (α, β, γ) are calculated as in the equation below:

[Equation]

where the quantities in the equation are the distance between the microphones, the speed of sound, and the delay time between microphone 1 and microphone 2.

The method of claim 1,

wherein step (d) comprises:

(d1) setting the rotation angle about the central axis of robot wheel rotation to the direction angle of the sound source; and

(d2) rotating the robot about the central axis of robot wheel rotation by the set angle so that the robot faces the direction in which the sound source was generated.

An intelligent robot capable of sound source tracking, comprising:

a sound source tracking unit that, when a voice signal is generated from an arbitrary direction within a given area where the robot is located, calculates the time delay values of the voice signal between multi-channel microphones and tracks the direction angle of the sound source that produced the voice signal;

a robot driving unit that rotates the robot to the angle tracked by the sound source tracking unit; and

a robot control unit that applies the direction angle of the sound source calculated by the sound source tracking unit to the robot driving unit and controls the robot to rotate toward the sound source,

wherein the sound source tracking unit comprises: a voice data acquisition unit that acquires the voice signal generated by the sound source through the multi-channel microphones and a sound board;

a delay time calculation unit that calculates the delay time values of the voice signal between the multi-channel microphones based on cross-correlation of the acquired voice signals in the frequency domain; and

a tracking angle processing unit that estimates the direction angle of the sound source by a geometric method using the inter-microphone voice signal delay time values,

and wherein the delay time calculation unit performs a Fourier transform on each voice signal obtained from the multi-channel microphones to obtain a weighting function value, uses it to calculate the cross-correlation values between the voice signals, and thereby calculates the delay times between the voice signals.

(Deleted)

The intelligent robot of claim 12,

wherein the voice data acquisition unit detects, as the caller's voice signal, an audio signal that lasts at least a predetermined reference time among the audio signals received from the multi-channel microphones and sound board.

(Deleted)

The intelligent robot of claim 12,

wherein the tracking angle processing unit calculates, from the delay times of the voice signals, a plurality of tracking angles estimated as the direction of the sound source, selects the two tracking angles closest to the direction of the sound source, and calculates the average of the two tracking angles to obtain the direction angle of the sound source.

The intelligent robot of claim 16,

wherein, when there are three microphones, the tracking angle processing unit obtains the three preliminary angles (α, β, γ) formed between the three straight lines connecting the microphones and the direction of the sound source using the delay time (τ) between the voice signals, and then calculates from the preliminary angles the plurality of tracking angles that the sound source makes with the robot.