KR102497914B1

KR102497914B1 - Control method of intelligent for sound-position tracking and system thereof

Info

Publication number: KR102497914B1
Application number: KR1020210141498A
Authority: KR
Inventors: 조용현; 안승민; 이현수; 이규표; 이도윤; 이재승
Original assignee: 조용현; 안승민; 이현수; 이규표; 이도윤; 이재승
Priority date: 2021-10-22
Filing date: 2021-10-22
Publication date: 2023-02-13

Abstract

The present invention relates to an intelligent control method and system for sound-position tracking. The angle of arrival is derived through machine learning using a neural network algorithm that can analyze both continuous and categorical variables, supports non-linear combinations of input variables, and performs well as the amount of data grows. Because the performance of the artificial neural network algorithm improves with the number of training data, it can be applied in practice to personal mobility devices to efficiently extract non-linear feature data that depends on many variables. The number of nodes is minimized and the learning computation is simplified, which overcomes the problem that a large number of nodes makes training slow and requires a high-performance computer, so that sound positions are tracked efficiently. The intelligent control method comprises a hidden layer step of performing machine learning on an artificial neural network using input data, and an output layer step of classifying the values derived during the machine learning, predicting the angle of arrival, and then reporting the location of the sound.

Description

Sound position tracking intelligent control method and system {Control method of intelligent for sound-position tracking and system}

The present invention relates to an intelligent control method and system for sound-position tracking. More specifically, it enables the introduction of various embedded systems that reduce the inconvenience faced by hearing-impaired people, so that they can use personal mobility devices safely regardless of sound. Through this, it provides hearing-impaired users of personal mobility devices with an algorithm that visualizes the angle of arrival (Direction of Arrival, DOA) of the location where a sound occurs. It also simplifies the computation of the angle-of-arrival derivation algorithm to improve memory efficiency while minimizing weight, space, and cost in hardware, thereby maximizing the overall efficiency of the intelligent control system.

Recently, the personal mobility market has grown rapidly and various safety technologies are being developed. Attention is focused on institutions, regulations, and infrastructure, such as establishing a safety culture in environments where various means of transportation are mixed, and development to secure driving safety is needed. For people with reduced mobility, personal mobility devices such as electric wheelchairs are increasingly used as assistive transport, but securing the related control technology is urgent for the convenience of users with various perceptual impairments, including visually and hearing-impaired people.

In particular, hearing-impaired people rely on visual information on roads where various personal means of transportation coexist, and are exposed to an environment in which they cannot perceive situations signaled by sound. Technology that indicates the location of a sound is therefore all the more needed to improve this.

To this end, the direction of a sound source has conventionally been estimated using a single deep neural network. However, when a single deep neural network is used across various reverberation environments, a direction estimation model suited to each environment cannot be selected, so the direction cannot be estimated precisely. Existing single-network techniques lack sufficient information about the various reverberation environments found in real life and therefore cannot provide a model suited to a given reverberation. This can lower the accuracy of sound source direction estimation in real-world settings where many reverberation environments exist.

Moreover, because the system must be mounted on a personal mobility device, large changes in the speed of sound caused by wind, together with hardware structures that block the propagation path through the medium, produce non-linear training data, so the angle of arrival cannot be derived by a simple theoretical calculation such as a general mathematical formula.

Republic of Korea Patent Publication No. 10-2019-0108711, published 2019.09.25.
Republic of Korea Patent Publication No. 10-2011-0011299, published 2011.02.08.
US Patent Publication No. 2020/0213728, published 2020.07.02.
US Patent Publication No. 2021/0103747, published 2021.04.08.
US Patent Publication No. 2018/0075860, published 2018.03.15.
US Patent Publication No. 2020/0005810, published 2020.01.02.
Japanese Laid-open Patent Publication No. Sho 64-50980, published 1989.02.27.

An object of the present invention is to provide a sound-position tracking intelligent control method and system that enables the introduction of various embedded systems that reduce the inconvenience faced by hearing-impaired people so that they can use personal mobility devices safely regardless of sound, that thereby provides hearing-impaired users of personal mobility devices with an algorithm visualizing the angle of arrival (DOA) of the location where a sound occurs, and that simplifies the computation of the angle-of-arrival derivation algorithm to improve memory efficiency while minimizing weight, space, and cost in hardware, maximizing the overall efficiency of the intelligent control system.

Another object of the present invention is to provide a sound-position tracking intelligent control method and system that derives the angle of arrival through machine learning using a neural network algorithm that can analyze both continuous and categorical variables, supports non-linear combinations of input variables, and performs well as the amount of data grows.

A further object of the present invention is to provide a sound-position tracking intelligent control method and system that uses an artificial neural network algorithm whose performance improves with the number of training data, so that it can be applied in practice to personal mobility devices and efficiently extract non-linear feature data that depends on many variables, and that minimizes the number of nodes and simplifies the learning computation to overcome the problem that a large number of nodes makes training slow and requires a high-performance computer, so that overall sound-position tracking is carried out efficiently.

To achieve the above technical objects, the present invention provides a method for tracking the position of a sound and controlling it intelligently, comprising: a generalized cross-correlation step of receiving 4-channel microphone signals and extracting from the sampling data a total of six arrival delay time estimates, one for each pair among the four channels; a normalization step of scaling the arrival delay time estimates extracted through the generalized cross-correlation to values between 0 and 1 in order to generalize them; an input layer step of using the feature values derived through the normalization step as the input values for machine learning; a hidden layer step of performing machine learning of an artificial neural network on the data input through the input layer step; and an output layer step of dividing the values derived during the machine learning of the artificial neural network into 20-degree units over 360 degrees, predicting the angle of arrival with 18 output nodes, and then reporting the location of the sound.

In the hidden layer step, a ReLU function is used so that the six values can be interpreted with 512 weights, allowing machine learning of the artificial neural network to proceed on the data input through the input layer step.

In the output layer step, a Softmax function is used so that the values derived through the hidden layer step are divided into 20-degree units over 360 degrees and the angle of arrival is predicted with 18 nodes.

According to an embodiment of the present invention, a system for tracking the position of a sound and controlling it intelligently comprises: a display unit that shows, in a polar coordinate system, the angle of arrival visualized from the tracked sound position; a control module that has an artificial neural network library function for deriving the angle of arrival and controls the functions of the other components; a ReSpeaker microphone array unit that is controlled by the control module and enables the position of the sound to be tracked; and a battery that supplies power to operate the components.

An enclosure with a certain internal space is provided so that the control module, the ReSpeaker microphone array unit, and the battery can be stably mounted and fastened to a personal mobility device, and the display unit, on which the user can easily check the polar-coordinate visualization of the angle of arrival, is provided at the upper part of one side of the enclosure.

According to an embodiment of the present invention, the control module further includes a security unit that, when the program has been maliciously altered by an external factor such as a hacker, refuses to execute the altered program and automatically restores the factory-default program.

According to an embodiment of the present invention, the control module can use a communication network selected from 5G, PLC communication, LTE, RS-485, Wi-Fi, and Bluetooth, and further includes a communication unit so that the control module and a management server can interoperate over that network.

According to another embodiment of the present invention, there is provided a computer-readable recording medium storing a computer program, wherein the computer program includes instructions that, when executed by a processor, cause the processor to perform a sound-position tracking intelligent control method comprising: a generalized cross-correlation step of receiving 4-channel microphone signals and extracting from the sampling data a total of six arrival delay time estimates, one for each pair among the four channels; a normalization step of scaling the arrival delay time estimates extracted through the generalized cross-correlation to values between 0 and 1 in order to generalize them; an input layer step of using the feature values derived through the normalization step as the input values for machine learning; a hidden layer step of performing machine learning of an artificial neural network on the data input through the input layer step; and an output layer step of dividing the values derived during the machine learning of the artificial neural network into 20-degree units over 360 degrees, predicting the angle of arrival with 18 output nodes, and then reporting the location of the sound.

According to an embodiment of the present invention, there is provided a computer program stored on a computer-readable recording medium, wherein the computer program includes instructions that, when executed by a processor, cause the processor to perform a sound-position tracking intelligent control method comprising: a generalized cross-correlation step of receiving 4-channel microphone signals and extracting from the sampling data a total of six arrival delay time estimates, one for each pair among the four channels; a normalization step of scaling the arrival delay time estimates extracted through the generalized cross-correlation to values between 0 and 1 in order to generalize them; an input layer step of using the feature values derived through the normalization step as the input values for machine learning; a hidden layer step of performing machine learning of an artificial neural network on the data input through the input layer step; and an output layer step of dividing the values derived during the machine learning of the artificial neural network into 20-degree units over 360 degrees, predicting the angle of arrival with 18 output nodes, and then reporting the location of the sound.

According to an embodiment of the present invention, various embedded systems can be introduced that reduce the inconvenience faced by hearing-impaired people so that they can use personal mobility devices safely regardless of sound; hearing-impaired users of personal mobility devices are provided with an algorithm that visualizes the angle of arrival (DOA) of the location where a sound occurs; and the computation of the angle-of-arrival derivation algorithm is simplified to improve memory efficiency while minimizing weight, space, and cost in hardware.

According to an embodiment of the present invention, the angle of arrival can be derived easily through machine learning, because a neural network algorithm is used that can analyze both continuous and categorical variables, supports non-linear combinations of input variables, and performs well as the amount of data grows.

According to an embodiment of the present invention, an artificial neural network algorithm whose performance improves with the number of training data is applied in practice to personal mobility devices, so that non-linear feature data depending on many variables can be extracted efficiently; and by minimizing the number of nodes and simplifying the learning computation, the problem that a large number of nodes makes training slow and requires a high-performance computer is overcome.

According to an embodiment of the present invention, presenting a sound-location visualization module for personal mobility as a whole efficiently improves on problems of practicality, cost, and computational load. Using a deep neural network in the machine learning configuration makes accurate prediction possible, which raises the value of the system for use in various fields and environments and allows hearing-impaired people to participate safely in the rapidly growing personal mobility market, maximizing the efficiency of the system.

FIG. 1 is a flowchart showing the procedure of the machine learning algorithm according to the sound-position tracking intelligent control method of the present invention.
FIG. 2 is a configuration diagram showing the product body to which the sound-position tracking intelligent control system of the present invention is applied.
FIG. 3 is a photograph showing the components of the sound-position tracking intelligent control system of the present invention.
FIG. 4 is a photograph showing the environment in which the sound-position tracking intelligent control system of the present invention was tested.
FIG. 5 shows a photograph of the product for the sound-position tracking intelligent control system of the present invention and the angle-of-arrival visualization displayed on the display unit.
FIG. 6 is a graph showing the sound-location accuracy measured as a function of wind speed in an experiment using the sound-position tracking intelligent control system of the present invention.

Advantages and features of the present invention, and methods of achieving them, will become clear with reference to the embodiments described below in conjunction with the accompanying drawings. However, the present invention is not limited to the embodiments disclosed below and may be implemented in various different forms. These embodiments are provided only to make the disclosure of the present invention complete and to fully inform those of ordinary skill in the art to which the present invention belongs of the scope of the invention, and the present invention is defined only by the scope of the claims.

The terms used in this specification are briefly explained, and the present invention is then described in detail.

The terms used in the present invention have been selected, as far as possible, from general terms currently in wide use while considering their function in the present invention, but they may vary depending on the intention of those skilled in the art, precedent, the emergence of new technologies, and the like. In certain cases a term has been chosen arbitrarily by the applicant, in which case its meaning is described in detail in the corresponding part of the description. Therefore, the terms used in the present invention should be defined based on their meaning and the overall content of the present invention, not simply on their names.

Throughout the specification, when a part is said to "include" a component, this means that it may further include other components, not that it excludes them, unless specifically stated otherwise, and singular expressions include plural expressions unless the context clearly indicates otherwise.

Hereinafter, embodiments of the present invention are described in detail with reference to the accompanying drawings so that those of ordinary skill in the art can easily carry them out. To describe the present invention clearly, parts of the drawings unrelated to the description are omitted. In the description with reference to the drawings, identical components are given the same reference numerals regardless of the figure, and redundant descriptions of them are omitted. Where a detailed description of related known technology would unnecessarily obscure the subject matter of the present invention, that description is omitted.

First, the present invention relates to a method for tracking the position of a sound and controlling it intelligently, which is described with reference to FIG. 1 of the accompanying drawings.

Specifically, generalized cross-correlation (GCC) is performed: when the 4-channel microphone signals arrive at 16,000 Hz, a total of six arrival delay time estimates, one for each pair among the four channels, are extracted from the 16,000 sampled data points (S100).
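The count of six follows from choosing 2 of the 4 channels. A minimal sketch of the pair enumeration is shown here; the channel indices 0 to 3 and the variable names are illustrative assumptions, not identifiers from the patent.

```python
from itertools import combinations

NUM_CHANNELS = 4  # 4-channel microphone array, as stated in the text

# Every unordered pair of distinct channels: C(4, 2) = 6 pairs.
pairs = list(combinations(range(NUM_CHANNELS), 2))

for first, second in pairs:
    print(f"delay estimate needed for channels {first} and {second}")
```

Each of these pairs yields one arrival delay time estimate, which is why the network described later has six inputs.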

The arrival delay time estimates extracted through the generalized cross-correlation are then passed through a normalization step to generalize them, and are stored as values between 0 and 1 (S200).
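The text does not state which normalization formula is used. One plausible choice, sketched below purely as an assumption, is min-max scaling of each delay from the physically possible range [-max_delay, +max_delay] onto [0, 1]; the function name and the `max_delay` parameter are hypothetical.

```python
import numpy as np

def normalize_delays(delays, max_delay):
    """Map delays in [-max_delay, +max_delay] onto [0, 1] (assumed min-max scaling)."""
    delays = np.asarray(delays, dtype=float)
    scaled = (delays + max_delay) / (2.0 * max_delay)
    # Clip so that estimates slightly outside the expected range stay in [0, 1].
    return np.clip(scaled, 0.0, 1.0)
```

With this scaling a zero delay maps to 0.5, and the two extreme delays map to 0 and 1.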

Next, an input layer step is performed in which the feature values derived through the normalization step become the input values for machine learning (S300), followed by a hidden layer step in which machine learning of the artificial neural network is performed on the data input through the input layer step (S400).

After the hidden layer step, an output layer step is performed: the values derived during the machine learning of the artificial neural network are divided into 20-degree units over 360 degrees, the angle of arrival is predicted with 18 output nodes, and the location of the sound is then reported (S500).
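Dividing 360 degrees into 20-degree units gives 360 / 20 = 18 classes. A small sketch of that mapping follows; it assumes the bins start at 0 degrees and are half-open, which the text does not specify, and the function names are hypothetical.

```python
BIN_WIDTH_DEG = 20
NUM_BINS = 360 // BIN_WIDTH_DEG  # 18 output nodes

def angle_to_bin(angle_deg):
    """Return the class index (0..17) of the 20-degree bin containing the angle."""
    return int(angle_deg % 360) // BIN_WIDTH_DEG

def bin_to_range(index):
    """Return the (start, end) angles in degrees covered by a class index."""
    start = index * BIN_WIDTH_DEG
    return start, start + BIN_WIDTH_DEG
```

For example, an angle of 45 degrees falls in class 2, which covers 40 to 60 degrees.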

As described above, after training, signals are received in real time and again passed through generalized cross-correlation and normalization, and the six resulting feature values become the inputs to the machine learning model. The ReLU function of the hidden layer interprets the six values with 512 weights, expanding the feature values, and the Softmax function of the output layer then predicts one of the final 18 angle ranges as the angle of arrival and reports the location of the sound.
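The forward pass described above can be sketched in NumPy as follows. This is an illustration under stated assumptions, not the patent's implementation: it assumes a single hidden layer of 512 ReLU units, uses random untrained weights, and all names (`W1`, `W2`, `predict_bin`) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed layer sizes: 6 inputs -> 512 hidden units (ReLU) -> 18 outputs (Softmax).
W1 = rng.normal(scale=0.1, size=(6, 512))
b1 = np.zeros(512)
W2 = rng.normal(scale=0.1, size=(512, 18))
b2 = np.zeros(18)

def relu(x):
    return np.maximum(x, 0.0)

def softmax(z):
    z = z - np.max(z)  # subtract the maximum for numerical stability
    e = np.exp(z)
    return e / e.sum()

def predict_bin(features):
    """Return (class index, class probabilities) for six normalized delay features."""
    hidden = relu(features @ W1 + b1)
    probs = softmax(hidden @ W2 + b2)
    return int(np.argmax(probs)), probs
```

The returned index selects one of the 18 angle ranges; with trained weights, the probabilities would indicate how confident the network is in each range.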

The training data of the present invention use only the peak value of the generalized cross-correlation (GCC). Using six input values in total, one per microphone pair, reduces the number of parameters, and the output layer expresses the angle of arrival in 20-degree ranges, which reduces the number of nodes and improves accuracy.

In most sound-position tracking, the straight-line distances of a multi-channel microphone array are measured and the time difference of arrival (TDOA) is obtained to compute the position. The equations below give the relation between the angle of arrival and the arrival delay time, and derive the arrival delay through the generalized cross-correlation with phase transform (GCC-PHAT).

[Equation 1]

Here, Equation 1 expresses that when a point sound source occurs, the arrival delay time (τ) between the microphones is determined by the speed of sound (c), the angle of arrival (θ), and the distance between the two microphones (d).
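A far-field relation commonly used for a two-microphone pair, and consistent with the variables named above, is sketched below. It assumes a plane wave and that θ is measured from the line joining the two microphones; if θ is instead measured from the broadside direction, the cosine becomes a sine. This is a reconstruction under those assumptions rather than a quotation of the patent's equation.

```latex
\tau = \frac{d \cos\theta}{c}
```

Under this convention, a source on the microphone axis (θ = 0) gives the largest delay d/c, and a source directly broadside (θ = 90°) gives zero delay.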

[Equation 2]

As Equation 2 shows, when the speed of sound (c) is taken as 343 m/s, assuming air at room temperature, and the distance between the two microphones (d) is a constant determined by the array geometry, the angle of arrival (θ) is easily obtained once the arrival delay time (τ) is known.
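As a worked illustration, the sketch below inverts the common far-field relation τ = d·cos(θ)/c to get θ = arccos(c·τ/d). The cosine convention (θ measured from the microphone axis) is an assumption, as are the function and parameter names; only c = 343 m/s comes from the text.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, room-temperature air as stated in the text

def angle_from_delay(tau, mic_distance, c=SPEED_OF_SOUND):
    """Return the angle of arrival in degrees for a delay tau (s) and spacing (m)."""
    ratio = c * tau / mic_distance
    # Clamp to the valid domain of acos to absorb small estimation errors.
    ratio = max(-1.0, min(1.0, ratio))
    return math.degrees(math.acos(ratio))
```

For a spacing of 0.1 m, a delay of zero corresponds to 90 degrees, and a delay of half the maximum (0.05 / 343 s) corresponds to about 60 degrees.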

The arrival delay time can be obtained relatively simply and accurately using the generalized cross-correlation with phase transform (GCC-PHAT).

First, the arrival delay time is derived through cross-correlation (CC), and then the generalized cross-correlation is obtained. The signals at the two microphones when a sound source occurs at a single point are given by Equations 3 and 4 below.

[Equation 3]

[Equation 4]

Here, s(t) is the speech signal and n(t) is the noise signal.
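The standard two-microphone signal model that matches these definitions is sketched below. The subscripts on the noise terms and the sign convention for the delay D (the second signal trailing the first) are assumptions for illustration.

```latex
\begin{align}
x_1(t) &= s(t) + n_1(t) \\
x_2(t) &= s(t - D) + n_2(t)
\end{align}
```

With this convention, the cross-correlation E[x_1(t) x_2(t + τ)] is largest when the lag τ equals D, provided the noise terms are uncorrelated with the source and with each other.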

The arrival delay time (D) to be found is the delay of the first signal relative to the second signal, and the cross-correlation is given by Equation 5 below.

[Equation 5]

Here, the cross-correlation has its peak at the point where τ and D become equal, and the arrival delay time estimate (D̂) is obtained through Equation 6 below.

[Equation 6]
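A minimal time-domain sketch of this peak-picking step is given below using `numpy.correlate`. It assumes discrete signals and returns the delay in samples; the function name and the sign convention (a positive result means the second signal trails the first) are illustrative assumptions.

```python
import numpy as np

def estimate_delay_cc(x1, x2):
    """Return the lag in samples at which the cross-correlation of x2 and x1 peaks."""
    # corr[k] = sum_n x2[n + k] * x1[n], for k from -(len(x1) - 1) to len(x2) - 1
    corr = np.correlate(x2, x1, mode="full")
    lags = np.arange(-(len(x1) - 1), len(x2))
    return int(lags[np.argmax(corr)])

# Synthetic check: x2 is x1 delayed by 5 samples.
rng = np.random.default_rng(0)
s = rng.standard_normal(1024)
x1 = np.concatenate([s, np.zeros(5)])
x2 = np.concatenate([np.zeros(5), s])
print(estimate_delay_cc(x1, x2))
```

Dividing the returned lag by the sampling rate (16,000 Hz in the text) converts it to a delay in seconds.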

Next, the generalized cross-correlation (GCC) is obtained by combining the cross-correlation with a whitening weight. Because the correlation peak becomes broad for low-frequency signals, the whitening weight is multiplied in to sharpen the peak, with the advantage that the signal spectral density is normalized by the spectral magnitude.

That is, to reduce the amount of computation required by Equation 5, the fast Fourier transform (FFT) is used to convert the time function into a frequency function, as shown in Equation 7 below.

[Equation 7]

Then, the generalized cross-correlation (GCC), multiplied by the whitening weight after the inverse Fourier transform, is as shown in Equation 8 below.

[Equation 8]

From the maximum value of the generalized cross-correlation of Equation 8, the value of D is derived as the estimate of the arrival delay time, and the angle of arrival is obtained by substituting this estimate into Equation 9 below.

[Equation 9]
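For orientation, the chain from the two-microphone signal model to the angle of arrival can be written in the standard textbook GCC-PHAT form. This is a sketch under assumptions: the symbols $x_1$, $x_2$, $n_1$, $n_2$, $\tau$, $G$, $\hat{D}$ and the far-field cosine geometry come from the common formulation, not from Equations 3 to 9 of this document, which may differ in notation or sign convention.

```latex
\begin{align}
x_1(t) &= s(t - D) + n_1(t), \qquad x_2(t) = s(t) + n_2(t) \\
R_{x_1 x_2}(\tau) &= E\left[ x_1(t + \tau)\, x_2(t) \right] \\
\hat{D} &= \arg\max_{\tau} R_{x_1 x_2}(\tau) \\
G_{x_1 x_2}(f) &= X_1(f)\, X_2^{*}(f) \\
R^{\mathrm{PHAT}}_{x_1 x_2}(\tau) &= \int_{-\infty}^{\infty}
    \frac{G_{x_1 x_2}(f)}{\left| G_{x_1 x_2}(f) \right|}\, e^{j 2 \pi f \tau}\, df \\
\theta &= \arccos\!\left( \frac{c\, \hat{D}}{d} \right)
\end{align}
```

In this common form the whitening weight $1/|G_{x_1 x_2}(f)|$ is applied in the frequency domain and the inverse transform is taken afterwards. With $c = 343$ m/s and a fixed microphone spacing $d$, the delay estimate then maps directly to an angle.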

Accordingly, using the artificial neural network library function, the present invention receives a real-time signal through a 4-channel microphone, applies the generalized cross-correlation (GCC) to the 16 kHz sampling data, and stores the value obtained from Equation 9. This value is input as training data for the machine-learning artificial neural network (ANN), the algorithm outputs the angle of arrival, and the derived angle of arrival is displayed in polar coordinates so that the user can easily check it through the display unit.

In this way, a real-time signal is received with the 4-channel microphone and sampled at 16 kHz. When a voice is recognized, the 16 kHz sampling data from the 4 channels are compared two channels at a time, and the time difference of arrival estimate for each pair is derived through the generalized cross-correlation (GCC) function.
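The pairwise delay estimation described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names `dft`, `gcc_phat_delay`, and `pairwise_delays`, the `max_lag` search bound, and the zero-padding are assumptions, and a naive standard-library DFT stands in for the FFT to keep the sketch dependency-free (`numpy.fft` would be the usual choice in practice).

```python
import cmath
from itertools import combinations


def dft(x, inverse=False):
    """Naive O(n^2) discrete Fourier transform (an FFT gives the same result faster)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for m in range(n):
        acc = sum(x[k] * cmath.exp(sign * 2j * cmath.pi * k * m / n) for k in range(n))
        out.append(acc / n if inverse else acc)
    return out


def gcc_phat_delay(x1, x2, max_lag):
    """Return the lag, in samples, by which x1 is delayed relative to x2."""
    n = len(x1) + len(x2)  # zero-pad so the circular correlation behaves like a linear one
    X1 = dft(list(x1) + [0.0] * (n - len(x1)))
    X2 = dft(list(x2) + [0.0] * (n - len(x2)))
    cross = [a * b.conjugate() for a, b in zip(X1, X2)]  # cross-spectrum
    # Whitening (PHAT) weight: keep only the phase of each frequency bin.
    whitened = [c / abs(c) if abs(c) > 1e-12 else 0.0 for c in cross]
    r = [v.real for v in dft(whitened, inverse=True)]
    lags = range(-max_lag, max_lag + 1)
    return max(lags, key=lambda lag: r[lag % n])  # negative lags wrap to the end


def pairwise_delays(channels, max_lag):
    """Delay estimate for every pair of channels; 4 channels give 6 pairs."""
    return {
        (i, j): gcc_phat_delay(channels[i], channels[j], max_lag)
        for i, j in combinations(range(len(channels)), 2)
    }
```

Calling `pairwise_delays` on four equally long channel buffers returns a dictionary keyed by channel index pairs. At a 16 kHz sampling rate, a delay of one sample corresponds to 1/16000 s.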

Referring to the accompanying FIGS. 2 and 3, a system that tracks the location of sound and controls it intelligently will now be described.

That is, the system according to the present invention includes a display unit (30), which shows the tracked position of the sound as an angle of arrival in a polar coordinate system so that a user such as a hearing-impaired person can easily check it. The display unit is provided at the upper end of one side of the enclosure (50) so that the user can readily see the polar coordinate system visualizing the angle of arrival.

The enclosure (50) is formed to have a certain internal space, in which the control module (10), the respeaker microphone array unit (20), and the battery (40) are mounted so that they are held stably. The components mounted in the enclosure may be configured as a single module.

Here, the control module (10) has an artificial neural network library function for deriving the angle of arrival and controls the functions of each component, and the respeaker microphone array unit (20) is controlled by the control module (10) and tracks the position of the sound.

In addition, the battery (40) provided in the enclosure (50) supplies power so that the components can operate.

Here, the artificial neural network library of the control module (10) reduces the number of nodes in the deep neural network structure to simplify the computation and thereby reduce real-time control delay.

FIG. 4 is an exemplary view showing the experimental environment used to test the system according to the present invention. A fan reproduced the headwind encountered while riding, a cylindrical object 0.8 m in diameter and 1.5 m in height stood in for a person, and the system was mounted on a personal mobility (PM) device so that the experiment matched actual riding conditions.

To match the headwind experienced while riding, accuracy was also measured at several wind speed levels. Since the maximum speed of personal mobility (PM) is 25 km/h, the maximum wind speed of the experiment was set to 25 km/h, and tests were run at 25 km/h, 15 km/h, and 5 km/h, at intervals of 10 km/h. For comparison, an experiment was also conducted in a baseline environment with no wind.

Training used supervised machine learning. The training data were derived by storing the signals of the 4 microphones. For each case, 3,600 sets of training data were collected, so that a total of 14,400 sets of data were extracted across the None, 5 km/h, 15 km/h, and 25 km/h conditions.

Accordingly, as can be seen in the output layer of the algorithm flowchart of the present invention shown in FIG. 1, the full range of the angle of arrival is classified into 20-degree ranges, giving 18 directions. The display unit (30), whose angle-of-arrival visualization screen is shown in FIG. 5, is inclined at 150 degrees so that a user such as a hearing-impaired person can check it efficiently and comfortably, and the red line shown on the visualization screen is drawn thicker so that the angle of arrival is easy to recognize.
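The division of 360 degrees into 18 classes of 20 degrees each can be illustrated with a small helper. This is only a sketch: the function names are hypothetical, and the choices that class 0 starts at 0 degrees and that each range is half-open are assumptions not stated in the text.

```python
NUM_CLASSES = 18
BIN_WIDTH_DEG = 360 // NUM_CLASSES  # 20 degrees per class


def angle_to_class(angle_deg):
    """Map an angle in degrees to one of the 18 direction classes (0 to 17)."""
    return int((angle_deg % 360) // BIN_WIDTH_DEG)


def class_to_range(index):
    """Return the (start, end) angle range in degrees covered by a class, end exclusive."""
    start = index * BIN_WIDTH_DEG
    return (start, start + BIN_WIDTH_DEG)
```

Under these assumptions, an angle of 45 degrees falls into class 2, which covers the range from 40 up to (but not including) 60 degrees.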

The angle of arrival predicted in real time by machine learning is displayed in a polar coordinate system through the Python3 plot function. If no sound is recognized, or if the personal mobility device's own sound is recognized, the phrase 'No Sound' is displayed on the screen of the display unit (30).
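A display routine of this kind might look like the sketch below. It is an illustration under assumptions: the text only mentions a Python3 plot function, so the use of matplotlib, the function names `display_state` and `draw`, the line width, and drawing the line at the center of the predicted 20-degree range are all hypothetical choices.

```python
import math

BIN_WIDTH_DEG = 20  # 360 degrees split into 18 classes


def display_state(class_index):
    """Map a predicted class (or None when no sound is detected) to what is drawn."""
    if class_index is None:
        return {"label": "No Sound", "theta": None}
    center_deg = (class_index + 0.5) * BIN_WIDTH_DEG
    return {"label": f"{center_deg:.0f} deg", "theta": math.radians(center_deg)}


def draw(state):
    """Render the state on a polar axis; matplotlib is imported only when drawing."""
    import matplotlib.pyplot as plt

    fig = plt.figure()
    ax = fig.add_subplot(projection="polar")
    ax.set_yticks([])
    ax.set_title(state["label"])
    if state["theta"] is not None:
        # Thick red line from the center outward, pointing at the estimated direction.
        ax.plot([state["theta"], state["theta"]], [0, 1], color="red", linewidth=6)
    return fig
```

Separating `display_state` from `draw` keeps the 'No Sound' decision testable without a display attached.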

FIG. 6 and Table 1 show the accuracy obtained in the experiment. At the maximum wind speed of 25 km/h, the test accuracy of the ANN algorithm was 70.18%, and the test accuracy gradually decreases as the wind speed increases.

Wind speed               None     5 km/h   15 km/h   25 km/h
ANN test accuracy (%)    99.53    87.62    79.62     70.12

Meanwhile, the control module (10) may further include a security safety unit (not shown). When the program has been maliciously altered by an external factor such as a hacker, this unit refuses to execute the altered program and automatically restores the program to its factory-shipped state.

The program described above may be run on an operating system (OS) such as embedded Linux and may also be run on a real-time operating system (RT-OS). It is split into application and middleware layers so that a common application written in Java or the like can operate compatibly regardless of the type or manufacturer of the hardware.

In addition, the control module (10) further includes a communication unit (not shown), which may use a communication network selected from 5G, PLC communication, LTE, RS-485, Wi-Fi, and Bluetooth. Communication between the control module (10) and a management server (not shown) may also be carried out over this communication network.

According to various embodiments, there is provided a computer-readable recording medium storing a computer program, the computer program including instructions that, when executed by a processor, cause the processor to perform a method comprising: a generalized cross-correlation step of receiving 4-channel microphone signals and extracting, from the sampling data, a total of six arrival delay time estimates, one for each pair of two channels among the four; a normalization step of scaling the arrival delay time estimates extracted through the generalized cross-correlation to values between 0 and 1 in order to generalize them; an input layer step of using the feature values derived in the normalization step as input values for machine learning; a hidden layer step of carrying out machine-learning training of an artificial neural network on the data input through the input layer step; and an output layer step of dividing the values derived during that training into 20-degree units over 360 degrees, predicting the angle of arrival with 18 nodes, and then indicating the position of the sound.
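The normalization, input layer, hidden layer, and output layer steps can be sketched as a single forward pass. This is an illustration under assumptions, not the patented network: the claims mention a ReLU function with 512 weights and a Softmax function over 18 nodes, which is read here as one hidden layer of 512 ReLU units; the min-max scaling by a `max_lag` bound, the random untrained weights, and all function names are hypothetical.

```python
import math
import random

N_INPUTS, N_HIDDEN, N_CLASSES = 6, 512, 18


def normalize(delays, max_lag):
    """Scale delays from [-max_lag, max_lag] to [0, 1] (assumed min-max scheme)."""
    return [(d + max_lag) / (2 * max_lag) for d in delays]


def dense(x, weights, bias):
    """Fully connected layer: one weight row and one bias per output node."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]


def relu(v):
    return [max(0.0, x) for x in v]


def softmax(v):
    m = max(v)  # subtract the maximum for numerical stability
    exps = [math.exp(x - m) for x in v]
    total = sum(exps)
    return [e / total for e in exps]


def make_untrained_params(seed=0):
    """Random placeholder weights; a real system would load trained values."""
    rng = random.Random(seed)

    def layer(n_in, n_out):
        scale = 1.0 / math.sqrt(n_in)
        weights = [[rng.uniform(-scale, scale) for _ in range(n_in)] for _ in range(n_out)]
        return weights, [0.0] * n_out

    return layer(N_INPUTS, N_HIDDEN), layer(N_HIDDEN, N_CLASSES)


def predict(delays, params, max_lag):
    """Return (index of the most probable 20-degree class, class probabilities)."""
    (w1, b1), (w2, b2) = params
    hidden = relu(dense(normalize(delays, max_lag), w1, b1))
    probs = softmax(dense(hidden, w2, b2))
    best = max(range(len(probs)), key=probs.__getitem__)
    return best, probs
```

With untrained weights the predicted class is meaningless; the sketch only shows how six normalized delay estimates flow through the layers to 18 class probabilities.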

Combinations of the steps in each flowchart attached to the present invention may be performed by computer program instructions. Since these computer program instructions may be loaded onto a processor of a general-purpose computer, a special-purpose computer, or other programmable data processing equipment, the instructions executed by that processor create means for performing the functions described in each step of the flowchart. These computer program instructions may also be stored in a computer-usable or computer-readable recording medium that can direct a computer or other programmable data processing equipment to implement functions in a particular way, so that the instructions stored in that medium can produce an article of manufacture containing instruction means for performing the functions described in each step of the flowchart. The computer program instructions may also be loaded onto a computer or other programmable data processing equipment, so that a series of operational steps is performed on the equipment to create a computer-executed process, and the instructions that run on the equipment provide steps for executing the functions described in each step of the flowchart.

Further, each step may represent a module, segment, or portion of code that includes one or more executable instructions for executing the specified logical function(s). It should also be noted that in some alternative embodiments the functions mentioned in the steps may occur out of order. For example, two steps shown in succession may in fact be performed substantially concurrently, or the steps may sometimes be performed in reverse order, depending on the function in question.

The above description merely illustrates the technical idea of the present invention, and those of ordinary skill in the art to which the present invention pertains will be able to make various modifications and variations without departing from the essential characteristics of the present invention. Therefore, the embodiments disclosed in the present invention are intended to explain, not to limit, the technical idea of the present invention, and the scope of the technical idea of the present invention is not limited by these embodiments. The scope of protection of the present invention should be construed according to the claims below, and all technical ideas within the scope equivalent thereto should be construed as being included in the scope of rights of the present invention.

10: control module
20: respeaker microphone array unit
30: display unit
40: battery
50: enclosure

Claims

A sound location tracking intelligent control method for tracking the location of sound and controlling it intelligently, comprising:
a generalized cross-correlation step of receiving 4-channel microphone signals and extracting, from the sampling data, a total of six arrival delay time estimates, one for each pair of two channels among the four;
a normalization step of scaling the arrival delay time estimates extracted through the generalized cross-correlation to values between 0 and 1 in order to generalize them;
an input layer step of using the feature values derived in the normalization step as input values for machine learning;
a hidden layer step of carrying out machine-learning training of an artificial neural network on the data input through the input layer step; and
an output layer step of dividing the values derived during the machine-learning training of the artificial neural network in the hidden layer step into 20-degree units over 360 degrees, predicting the angle of arrival with 18 nodes, and then indicating the location of the sound.

The sound location tracking intelligent control method according to claim 1,
wherein, in the generalized cross-correlation step, the arrival delay time estimate is extracted through the following equations:
[Equation 1]

Here, Equation 1 expresses the arrival delay time between the microphones when a single sound source occurs, in terms of the speed of sound (c), the angle of arrival, and the distance (d) between the two microphones,
[Equation 2]

Here, in Equation 2, the speed of sound (c) is defined as 343 m/s assuming a room-temperature atmosphere, and the distance (d) between the two microphones is a constant value determined by the arrangement,
[Equation 3]

[Equation 4]

Here, s(t) is the voice signal and n(t) is the noise signal,
[Equation 5]

Here, the cross-correlation has a peak value at the point where its lag variable and the arrival delay time (D) become equal,
[Equation 6]

Here, Equation 6 gives the estimate of the arrival delay time,
[Equation 7]

Here, Equation 7 converts the time function into a frequency function using the fast Fourier transform (FFT) to reduce the amount of computation required by Equation 5,
[Equation 8]

Here, Equation 8 shows the generalized cross-correlation (GCC) multiplied by the whitening weight after the inverse Fourier transform, and the estimate of the arrival delay time (D) is derived from the maximum value of this generalized cross-correlation,
[Equation 9]

Here, the angle of arrival is obtained through Equation 9.

The sound location tracking intelligent control method according to claim 1,
wherein, in the hidden layer step, the ReLU function is applied so that the six input values are interpreted as 512 weights, in order to carry out machine-learning training of the artificial neural network on the data input through the input layer step.

The sound location tracking intelligent control method according to claim 1,
wherein, in the output layer step, a Softmax function is applied so that the values derived in the hidden layer step are divided into 20-degree units over 360 degrees and the angle of arrival is predicted with 18 nodes.

A sound location tracking intelligent control system for tracking the location of sound and controlling it intelligently, comprising:
a display unit that shows the tracked position of the sound in a polar coordinate system, visualized as an angle of arrival;
a control module that has an artificial neural network library function for deriving the angle of arrival and controls the functions of each component;
a respeaker microphone array unit that is controlled by the control module and tracks the location of the sound; and
a battery that supplies power so that the components can operate.

The sound location tracking intelligent control system according to claim 5,
further comprising an enclosure having a certain internal space in which the control module, the respeaker microphone array unit, and the battery are stably mounted, the enclosure being attachable to a personal mobility device, wherein the display unit is provided at the upper end of one side of the enclosure so that the user can easily check the polar coordinate system visualizing the angle of arrival.

A computer-readable recording medium storing a computer program,
wherein the computer program includes instructions that, when executed by a processor, cause the processor to perform a sound location tracking intelligent control method comprising:
a generalized cross-correlation step of receiving 4-channel microphone signals and extracting, from the sampling data, a total of six arrival delay time estimates, one for each pair of two channels among the four;
a normalization step of scaling the arrival delay time estimates extracted through the generalized cross-correlation to values between 0 and 1 in order to generalize them;
an input layer step of using the feature values derived in the normalization step as input values for machine learning;
a hidden layer step of carrying out machine-learning training of an artificial neural network on the data input through the input layer step; and
an output layer step of dividing the values derived during the machine-learning training of the artificial neural network in the hidden layer step into 20-degree units over 360 degrees, predicting the angle of arrival with 18 nodes, and then indicating the location of the sound.

A computer program stored on a computer-readable recording medium,
wherein the computer program includes instructions that, when executed by a processor, cause the processor to perform a sound location tracking intelligent control method comprising:
a generalized cross-correlation step of receiving 4-channel microphone signals and extracting, from the sampling data, a total of six arrival delay time estimates, one for each pair of two channels among the four;
a normalization step of scaling the arrival delay time estimates extracted through the generalized cross-correlation to values between 0 and 1 in order to generalize them;
an input layer step of using the feature values derived in the normalization step as input values for machine learning;
a hidden layer step of carrying out machine-learning training of an artificial neural network on the data input through the input layer step; and
an output layer step of dividing the values derived during the machine-learning training of the artificial neural network in the hidden layer step into 20-degree units over 360 degrees, predicting the angle of arrival with 18 nodes, and then indicating the location of the sound.