KR20210115425A

KR20210115425A - Smart Volume Control System of Voice Information System According to Noise Levels by Real-Time Acoustic Analysis

Info

Publication number: KR20210115425A
Application number: KR1020200031190A
Authority: KR
Inventors: 전주천
Original assignee: 주식회사 센스비전
Priority date: 2020-03-13
Filing date: 2020-03-13
Publication date: 2021-09-27
Also published as: KR102421158B1

Abstract

The present invention relates to a smart volume control system (1). The smart volume control system (1) comprises: an input unit (3) for collecting ambient sound; an extraction unit (5) which analyzes the sound input in real time through the input unit (3) and simultaneously extracts sound data and noise data; a voice analysis unit (7) for dividing the extracted sound data by time and analyzing it; a learning unit (9) which learns the sound data extracted through the extraction unit (5) through deep learning, extracts and vectorizes sound patterns, analyzes their similarity, and distinguishes whether an analyzed sound pattern is general noise or accident noise; an IOT interworking unit (11) which notifies an external organization by interworking with an IOT device in accordance with the result learned by the learning unit (9); a noise control unit (13) for classifying the sound analyzed by the voice analysis unit (7) in accordance with intensity; and an output unit (17) for making announcements in accordance with the analysis results of the voice analysis unit (7) and the noise control unit (13). The present invention can therefore appropriately control the output volume of a voice guidance system in accordance with the ambient noise level.

Description

Smart Volume Control System of Voice Information System According to Noise Levels by Real-Time Acoustic Analysis

The present invention relates to a smart volume control system and, more particularly, to a technology that extracts sounds made by people, vehicles, and the like from sound sources collected in real time, analyzes the extracted sound data to classify the noise level by hour, by day, and by daytime and nighttime, and classifies sound patterns by deep learning, so that the output volume of a voice guidance system can be adjusted appropriately according to the ambient noise level.

Recently, with the development of smart home and smart city services that apply IoT technology, many voice guidance devices are being installed at bus stops, crosswalks, and roads, for example for bus arrival time announcements, voice guidance in pedestrian safety systems, and voice guidance for reducing fine dust.

However, such conventional voice guidance devices have the following problems.

First, because the device always outputs at a constant volume, the voice guidance cannot be heard during hours of heavy traffic and is inconveniently loud during quiet hours.

Second, because such a device cannot distinguish whether a noise is general noise or noise caused by an accident, it is difficult to respond appropriately to the situation.

Patent Application No. 10-2019-0098376 (Title: Intelligent voice output method, voice output device and intelligent computing device)

Accordingly, the present invention has been devised to solve these problems, and an object of the present invention is to provide a system that can accurately deliver an intended message by varying the announcement according to ambient noise.

Another object of the present invention is to provide a system that can give guidance suited to the situation by extracting sounds made by people, vehicles, and the like from sound sources collected in real time, analyzing the extracted sound data to classify the noise level by hour, by day, and by daytime and nighttime, and classifying sound patterns by deep learning.

Still another object of the present invention is to provide a system that can respond quickly to an accident by distinguishing whether ambient noise is simple noise or noise caused by an accident.

In order to achieve the above objects, an embodiment of the present invention provides a smart volume control system (1) comprising:

an input unit (3) for collecting ambient sound;

an extraction unit (5) that analyzes the sound input in real time through the input unit (3) and simultaneously extracts sound data and noise data;

a voice analysis unit (7) that divides the extracted sound data by time and analyzes it;

a learning unit (9) that learns the sound data extracted by the extraction unit (5) through deep learning, extracts and vectorizes sound patterns, analyzes their similarity, and distinguishes whether an analyzed sound pattern is general noise or accident noise;

an IOT interworking unit (11) that interworks with an IOT device and notifies an external organization according to the result learned by the learning unit (9);

a noise control unit (13) that classifies the sound analyzed by the voice analysis unit (7) according to intensity; and

an output unit (17) that makes announcements according to the analysis results of the voice analysis unit (7) and the noise control unit (13).

As described above, the smart voice guidance system according to an embodiment of the present invention has the following effects.

First, the ambient noise level is analyzed from sound received in real time through the microphone or decibel measuring sensor of the input unit. When the noise level is high, the volume of the output unit is raised so that the voice guidance system can fulfill its function; when the surroundings are quiet and the noise level is low, the volume of the output unit is lowered so that nearby people are less disturbed by the voice guidance. The voice guidance system thus performs its intended function more effectively.

Second, by distinguishing general noise from noise generated in an accident, sounds of emergency situations such as violence or a car crash are analyzed, and information on the unexpected situation is sent through an IoT device linked to the system to the nearest police station, emergency rescue center, or the like, so that people can be rescued from emergencies.

Third, by additionally providing a cough sound recognition unit, cough sounds in the collected sound data are analyzed, so that a coughing pedestrian at a given stop can be identified in real time and reported to the relevant organization, and the pedestrian can also be advised by announcement to visit that organization.

FIG. 1 is a diagram schematically showing the structure of a smart volume system according to an embodiment of the present invention.
FIG. 2 is a block diagram schematically showing the structure of the learning unit (CPU) shown in FIG. 1.
FIG. 3 is a block diagram schematically showing the structure of the sound extraction unit shown in FIG. 1.
FIG. 4 is a block diagram schematically showing the structure of the IOT (Internet of Things) interworking unit shown in FIG. 1.
FIG. 5 is a view showing sound waves processed by the smart volume system shown in FIG. 1.
FIG. 6 is a block diagram schematically showing the structure of a cough sound recognition unit according to another embodiment of the present invention.

Hereinafter, a smart volume control system according to the present invention will be described in detail with reference to the accompanying drawings.

As shown in FIGS. 1 to 5, the smart volume control system (1) proposed by the present invention extracts human sounds, vehicle sounds, and the like from sound input in real time, analyzes the extracted sound data and the noise at the same time, and controls the output volume of the voice guidance system accordingly.

The smart volume control system (1) includes:

an input unit (3) for collecting ambient sound; an extraction unit (5) that analyzes the sound input in real time through the input unit (3) and simultaneously extracts sound data and noise data; a voice analysis unit (7) that divides the extracted sound data by time and analyzes it; a learning unit (9) that learns the sound data extracted by the extraction unit (5) through deep learning, extracts and vectorizes sound patterns, analyzes their similarity, and distinguishes whether an analyzed sound pattern is general noise or accident noise; an IOT interworking unit (11) that interworks with an IOT device and notifies an external organization according to the result learned by the learning unit (9); a noise control unit (13) that classifies the sound analyzed by the voice analysis unit (7) according to intensity; and an output unit (17) that makes announcements according to the analysis results of the voice analysis unit (7) and the noise control unit (13).

This is described in more detail below.

The input unit (3) is an electronic device that collects ambient sound and includes, for example, an embedded microphone (Embedded MIC) or a decibel measuring sensor. Ambient sound is thus collected through the input unit (3), converted into digital data, and transmitted to the extraction unit (5).

The extraction unit (5) separates sound data and noise data from the transmitted sound.

The extraction unit (5) does so by band-filtering the sound signal so that only specific frequencies pass.

As shown in FIG. 3, the extraction unit (5) includes: an input terminal (20) that receives the sound wave signal transmitted from the input unit (3); an amplifier (22) that amplifies the digital signal output from the input terminal (20); a band filter (24) that passes only frequencies in a specific band of the amplified signal; a detector (26) that detects the filtered frequencies; and a shaper (28) that shapes the signal output from the detector (26).

In the extraction unit (5) having this structure, the sound signal that has passed through the input terminal (20) may be amplified to a predetermined frequency or higher while passing through the amplifier (22).

The band filter (24) then outputs only the frequencies of a specific band of the amplified signal. To this end, the band filter (24) has an arrangement in which an integrating circuit and a differentiating circuit are combined using a resistor, a plurality of capacitors, and a diode.

The boundary frequencies of the first capacitor and the second capacitor may be determined by the following equations:

Boundary frequency of the first capacitor: f1 = 1/(2π·R·C1) -------- Equation 1

Boundary frequency of the second capacitor: f2 = 1/(2π·R·C2) -------- Equation 2

(f1: first boundary frequency, f2: second boundary frequency, R: resistance, C1, C2: capacitances of the first and second capacitors)

As Equations 1 and 2 show, the first and second boundary frequencies can be set by appropriately varying the capacitance of the capacitors or the resistance of the resistor.

Accordingly, only a target frequency f that satisfies f1 < f < f2, that is, a frequency in the band between the first and second boundary frequencies, is passed.
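
As a rough illustration of Equations 1 and 2, the following Python sketch computes the two boundary frequencies and checks whether a frequency falls in the pass band. It reads the equations as the standard RC relation f = 1/(2πRC); the function names and the component values (10 kΩ, 800 nF, 160 nF) are assumptions chosen only so that the band lands near 20 Hz to 100 Hz, and are not taken from the specification.

```python
import math


def boundary_frequency(resistance_ohm: float, capacitance_farad: float) -> float:
    """Boundary frequency in Hz of an RC stage, f = 1 / (2 * pi * R * C)."""
    return 1.0 / (2.0 * math.pi * resistance_ohm * capacitance_farad)


def in_pass_band(f_hz: float, f1_hz: float, f2_hz: float) -> bool:
    """True when f1 < f < f2, the pass condition stated in the description."""
    return f1_hz < f_hz < f2_hz


# Hypothetical component values, for illustration only.
R = 10_000.0   # ohms
C1 = 800e-9    # farads, sets the lower boundary f1
C2 = 160e-9    # farads, sets the upper boundary f2

f1 = boundary_frequency(R, C1)
f2 = boundary_frequency(R, C2)
print(f"f1 = {f1:.1f} Hz, f2 = {f2:.1f} Hz")
print(in_pass_band(50.0, f1, f2), in_pass_band(150.0, f1, f2))
```

Because each boundary frequency is inversely proportional to R and C, a smaller capacitance gives a higher boundary frequency, which is why C2 is smaller than C1 in this example.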

In this way, noise data can be separated by setting a reference frequency that distinguishes noise data from sound data.

For example, signals in the 20 Hz to 100 Hz band may be treated as sound data, and signals in the 100 Hz to 200 Hz band as noise data.

In some cases, a plurality of band filters (24) may be arranged, with the pass bands of the band filters (24) forming multiple channels, so that noise data can be separated according to more varied criteria.
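
The multi-channel arrangement can be pictured as a table of pass bands, each tagged with the kind of data it yields. The sketch below is a software analogy only: the channel table reuses the 20 Hz to 100 Hz and 100 Hz to 200 Hz example bands, while the function name and the choice to treat each band as including its lower edge and excluding its upper edge are assumptions, since the description does not say which band owns the 100 Hz boundary.

```python
from typing import List, Optional, Tuple

# Each channel is (low_hz, high_hz, label). The two bands follow the
# example in the description; further channels could be appended.
Channel = Tuple[float, float, str]
CHANNELS: List[Channel] = [
    (20.0, 100.0, "sound"),
    (100.0, 200.0, "noise"),
]


def classify_frequency(f_hz: float, channels: List[Channel] = CHANNELS) -> Optional[str]:
    """Return the label of the first channel whose band contains f_hz.

    Bands are treated as half-open [low, high); this is an assumption.
    Returns None when no channel passes the frequency.
    """
    for low_hz, high_hz, label in channels:
        if low_hz <= f_hz < high_hz:
            return label
    return None


print(classify_frequency(50.0))    # inside the first band
print(classify_frequency(150.0))   # inside the second band
print(classify_frequency(500.0))   # outside every band
```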

The extracted sound data is then analyzed by the voice analysis unit (7) for characteristic patterns over time.

That is, the extracted sound data is stored separately by hour, by day, and by daytime and nighttime. For example, combining time data with the sound data makes it possible to determine when each piece of sound data was collected.

Time data can be combined with the sound data by linking the voice analysis unit (7) with a digital timer.

Combining time data in this way allows the sound data to be stored separately by time.
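
One way to picture the combination of sound data and time data is to attach a timestamp to every measurement and derive grouping keys from it. The following is a minimal sketch under stated assumptions: the function names, the representation of a sample as a (timestamp, level in dB) pair, and the 06:00 and 18:00 boundaries between daytime and nighttime are all hypothetical and do not come from the specification.

```python
from collections import defaultdict
from datetime import datetime
from typing import Dict, List, Tuple

Sample = Tuple[datetime, float]  # (collection time, measured level in dB)


def time_keys(ts: datetime, day_start_hour: int = 6, night_start_hour: int = 18) -> Dict[str, str]:
    """Derive hourly, daily, and day/night keys from one timestamp."""
    period = "day" if day_start_hour <= ts.hour < night_start_hour else "night"
    return {
        "hour": ts.strftime("%Y-%m-%d %H"),
        "date": ts.strftime("%Y-%m-%d"),
        "period": period,
    }


def group_by(samples: List[Sample], key: str) -> Dict[str, List[float]]:
    """Group measured levels by one of the keys 'hour', 'date', or 'period'."""
    groups: Dict[str, List[float]] = defaultdict(list)
    for ts, level_db in samples:
        groups[time_keys(ts)[key]].append(level_db)
    return dict(groups)


samples = [
    (datetime(2020, 3, 13, 8, 15), 62.0),
    (datetime(2020, 3, 13, 8, 45), 66.0),
    (datetime(2020, 3, 13, 23, 5), 41.0),
]
print(group_by(samples, "period"))
print(group_by(samples, "hour"))
```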

The learning unit (9) then analyzes the patterns in this sound data, extracts characteristic patterns, and vectorizes them. That is, the sound data can be vectorized by the learning unit (9) and analyzed by deep learning.

In more detail, the learning unit (9) analyzes and classifies sound patterns by deep learning to distinguish noise from general sound. The learning unit (9) may be implemented by a central processing unit (CPU) equipped with an acoustic analysis engine.

The learning unit (9) includes: a central processing unit (10) that classifies the sound data by a deep learning method, thereby classifying sounds according to similarity; and a sound classification module (30) that distinguishes whether the sound data classified by the central processing unit (10) is general noise or accident noise.

As shown in FIG. 2, the central processing unit (10) is composed of an acoustic analysis engine, a GPU, an ISP (Image Signal Processor), ALSA (Advanced Linux Sound Architecture), and RTSP (Real Time Streaming Protocol), and interworks with RAM, a LAN, or a beacon.

In the central processing unit (10), various algorithms can be used for sound pattern analysis. For example, a deep learning method may compare the frequency bands, waveforms, and the like of sounds and determine their similarity in order to analyze patterns.

Deep learning is a machine learning method built on artificial neural networks (ANNs) so that a computer can learn on its own from many kinds of data, as a human does.

An artificial neural network makes classification and clustering of sound data possible: similarity can be judged by stacking several layers on top of the data to be classified or clustered.

That is, similarity can be judged by vectorizing the sound data with the artificial neural network, extracting waveform features, and using those features as input to another machine learning algorithm that classifies or clusters the data by waveform.

Such artificial neural networks include deep neural networks, that is, neural networks made up of multiple layers.

An artificial neural network thus consists of multiple layers, each made up of several nodes. The computation that actually classifies the waveform of the sound data takes place at each node, and this computation is designed to mimic what happens in the neurons that make up the human neural network.

A node responds when it receives a stimulus above a certain magnitude, and the magnitude of the response is roughly proportional to the input value multiplied by the node's coefficient (or weight). A node generally receives several pieces of sound data as input and has as many coefficients as inputs. Adjusting these coefficients therefore assigns different weights to different inputs.

Finally, all the products are summed, and the sum is fed into an activation function. The result of the activation function is the node's output, and this output is ultimately used for classification or regression analysis.
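
The node computation described above (multiply each input by its coefficient, sum the products, apply an activation function) can be sketched in a few lines of Python. The sigmoid activation and the optional bias term are assumptions added for illustration; the description names neither.

```python
import math
from typing import Callable, Sequence


def sigmoid(z: float) -> float:
    """Logistic activation, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def node_output(
    inputs: Sequence[float],
    weights: Sequence[float],
    bias: float = 0.0,  # assumption: a bias term is not mentioned in the description
    activation: Callable[[float], float] = sigmoid,
) -> float:
    """Weighted sum of the inputs followed by an activation function."""
    if len(inputs) != len(weights):
        raise ValueError("a node needs exactly one coefficient per input")
    weighted_sum = sum(x * w for x, w in zip(inputs, weights)) + bias
    return activation(weighted_sum)


# Two inputs whose weighted contributions cancel: 1.0*0.5 + 2.0*(-0.25) = 0.
print(node_output([1.0, 2.0], [0.5, -0.25]))
```

A layer is then a list of such nodes that all read the same inputs, and the list of their outputs becomes the input of the next layer.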

Each layer consists of several nodes, and whether each node is activated or not is determined by the input sound data. The input data is the input to the first layer, and from then on the output of each layer becomes the input to the next layer.

All coefficients change little by little during the learning of sound data waveforms, and as a result they reflect which inputs each node treats as important. Training the neural network is the process of updating these coefficients.

When the waveforms of sound data are learned, each layer of such a deep neural network learns features at a different level.

That is, lower layers learn simple, concrete features, for example the curve shape (C) that makes up the waveform of the sound data, while higher layers learn more complex and abstract features, for example the height (H), spacing (R), and curvature (P) of the waveform.

Through this process of learning abstractions, the deep neural network comes to understand high-dimensional sound data, and hundreds of millions to billions of coefficients are involved. (Non-linear functions are used in this process.)

A deep neural network can also use the data to identify its latent structures, such as the height (H) and spacing (R) of the waveform and the curvature (P) of its peaks. This makes it possible to identify similarities between data effectively even when the data is unlabeled, so deep neural networks are effective for clustering sound data.

For example, a neural network can take in a large amount of sound data and group similar sound data together.

Even when learning from unlabeled data, a neural network can extract the features of the sound data automatically. There are several methods for this automatic extraction; usually the network is trained so that its output equals its input.

Whatever the kind of label (the input itself or a separate label), the neural network looks for the correlation between input and output. In some cases, the network may first be trained to some extent with labeled data and then trained further with added unlabeled data. This approach can maximize the performance of the neural network.

The last layer of a deep neural network is the output layer. Its activation function is in most cases a logistic or softmax function, and the output layer finally yields the probability of each label. For example, when sound data is input, the probabilities that the waveform is short and dense, long and gentle, and so on can be obtained.
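
To make the output layer concrete, the sketch below applies a softmax function to raw scores and pairs the resulting probabilities with labels. The two labels echo the waveform example in the description; the scores and the function names are hypothetical.

```python
import math
from typing import Dict, List, Sequence


def softmax(scores: Sequence[float]) -> List[float]:
    """Convert raw scores into probabilities that sum to 1."""
    highest = max(scores)  # subtracting the maximum avoids overflow in exp
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


LABELS = ["short and dense", "long and gentle"]


def label_probabilities(scores: Sequence[float], labels: Sequence[str] = LABELS) -> Dict[str, float]:
    """Pair each label with the probability assigned by the output layer."""
    return dict(zip(labels, softmax(scores)))


# Hypothetical raw scores from the layer before the output layer.
print(label_probabilities([2.0, 0.0]))
```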

First, before learning begins, all coefficients of the neural network are initialized. Learning then proceeds by repeatedly feeding in sound data. If learning has gone well, the coefficients will have been updated to appropriate values, and the artificial neural network can then be used for various kinds of classification and prediction.

Within the learning process, this coefficient update is carried out repeatedly.

The principle of the update is to first estimate the coefficients, measure the error that results when those coefficients are used, and then adjust the coefficients slightly based on that error.
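
The estimate, measure, adjust cycle can be illustrated with the smallest possible model, a single coefficient w in prediction = w * x. The sketch uses plain gradient descent on the squared error, which is one common way to turn an error into a small coefficient update; the description does not name a specific update rule, so this choice, the learning rate, and the fixed starting value of 0.0 (used instead of a random value to keep the run reproducible) are assumptions.

```python
from typing import List, Tuple


def train_single_weight(
    samples: List[Tuple[float, float]],
    initial_weight: float = 0.0,
    learning_rate: float = 0.1,
    epochs: int = 100,
) -> float:
    """Fit w in prediction = w * x by repeated small error-driven updates."""
    w = initial_weight
    for _ in range(epochs):
        for x, target in samples:
            prediction = w * x            # use the current coefficient
            error = prediction - target   # measure the resulting error
            # Move w slightly against the gradient of 0.5 * error**2.
            w -= learning_rate * error * x
    return w


# Data generated by target = 2 * x, so training should approach w = 2.
data = [(1.0, 2.0), (2.0, 4.0)]
print(train_single_weight(data))
```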

The coefficients of a neural network are collectively called a model; a model may be in an initialized state or in a state in which learning has been completed.

An initialized model cannot do meaningful work, but as learning progresses the model outputs results that resemble reality rather than arbitrary values.

This is because the artificial neural network knows nothing until data is input, which is also why the coefficients are initialized to arbitrary values. As the network reads the data, the coefficients are updated little by little in the right direction.

Through this update process, the artificial neural network classifies the input sound data and can thus cluster similar sound data. The clustered sound data is registered in a database.

The sound classification module (30) then classifies the sound data, which has been classified by similarity by the central processing unit (10), as general noise or accident noise.

That is, the sound classification module (30) compares the classified sound data with the sound data registered in the database to determine which type of sound it resembles. This comparison may of course also be performed by deep learning.

The sound is then classified as general noise or accident noise according to the degree of similarity.

For example, if the collected sound is similar to sound data of a vehicle collision, the sound classification module (30) determines that it is a sound caused by a vehicle collision.
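
The comparison against registered sound data can be sketched as a nearest-match search over feature vectors. Cosine similarity is used here as one plausible similarity measure; the description does not specify the measure, and the registered vectors, their three dimensions, and all names below are invented for illustration.

```python
import math
from typing import Dict, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical database: sound type -> (reference vector, category).
REGISTERED: Dict[str, Tuple[Sequence[float], str]] = {
    "vehicle collision": ([0.9, 0.1, 0.8], "accident"),
    "street chatter": ([0.2, 0.7, 0.1], "general"),
}


def classify_sound(vector: Sequence[float], registered=REGISTERED) -> Tuple[str, str, float]:
    """Return (closest sound type, its category, similarity score)."""
    best = max(registered, key=lambda name: cosine_similarity(vector, registered[name][0]))
    reference, category = registered[best]
    return best, category, cosine_similarity(vector, reference)


print(classify_sound([0.8, 0.2, 0.9]))
print(classify_sound([0.1, 0.9, 0.0]))
```

A practical system would probably also apply a minimum similarity threshold so that sounds unlike anything registered are not forced into a category; that refinement is left out of the sketch.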

When sounds are classified by accident type in this way and the situation is judged to be an emergency, such as violence or a car crash, the relevant organization can be notified in conjunction with an IOT device.

The IOT interworking unit (11) includes: a search module (32) that searches for organizations relevant to the sound determined by the sound classification module (30); and a notification module (34) that transmits a signal to the organizations found to report that an accident has occurred.

In more detail, the search module (32) retrieves organizations related to, for example, vehicle accidents from a database, such as hospitals, police stations, and insurance companies on the list.

The notification module (34) then informs the retrieved organizations by mail, message, or the like that a vehicle accident has occurred. The GPS signal for the location where the IOT device is installed is transmitted as well, so the organizations can accurately determine the accident location in real time.
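
The division of labor between the search module (32) and the notification module (34) might look like the following sketch, which only builds the messages and does not send anything. The directory contents beyond the vehicle-accident example, the `Notification` fields, the function names, and the coordinates are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List

# Hypothetical directory: accident type -> organizations to notify.
# The vehicle-collision entry follows the example in the description;
# the violence entry is an assumption.
AGENCY_DIRECTORY: Dict[str, List[str]] = {
    "vehicle collision": ["hospital", "police station", "insurance company"],
    "violence": ["police station", "hospital"],
}


@dataclass(frozen=True)
class Notification:
    recipient: str
    accident_type: str
    latitude: float   # position of the installed IOT device
    longitude: float


def search_agencies(accident_type: str, directory: Dict[str, List[str]] = AGENCY_DIRECTORY) -> List[str]:
    """Role of the search module: look up organizations for an accident type."""
    return list(directory.get(accident_type, []))


def build_notifications(accident_type: str, latitude: float, longitude: float) -> List[Notification]:
    """Role of the notification module: one message per organization found."""
    return [
        Notification(agency, accident_type, latitude, longitude)
        for agency in search_agencies(accident_type)
    ]


for note in build_notifications("vehicle collision", 37.5665, 126.9780):
    print(note)
```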

Meanwhile, the sound data is analyzed by the noise control unit (13) and divided into levels according to sound intensity.

For example, the noise control unit (13) may be linked with a band filter that evaluates noise intensity, and the sound data may be passed through this band filter to divide it into a plurality of levels.

For example, the noise may be divided into levels 1 to 10, with the lowest noise set to level 1 and the level rising gradually to level 10 for the highest noise.

In this way, the sound data is divided into levels according to noise intensity.

The output unit (15) then makes announcements according to the noise intensity classified in this way.

That is, when the noise level of the ambient sound is low, the noise control unit (13) lowers the output of the output unit (15) accordingly for the announcement; conversely, when the noise level is high, it raises the output for the announcement.

Adjusting the loudness of the announcement according to the ambient noise level in this way makes the announcement effective. The output unit (15) includes a speaker (19) and the like.
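
The ten-level scheme and the resulting volume adjustment can be sketched as two small mappings. The description fixes only the number of levels and the direction of the adjustment; the 40 dB to 90 dB measurement range, the 20 to 100 volume scale, and the function names are assumptions.

```python
def noise_level(level_db: float, min_db: float = 40.0, max_db: float = 90.0, steps: int = 10) -> int:
    """Map a measured level in dB to a noise level from 1 (quietest) to `steps`."""
    clamped = min(max(level_db, min_db), max_db)
    fraction = (clamped - min_db) / (max_db - min_db)
    return min(steps, int(fraction * steps) + 1)


def output_volume(level: int, min_volume: int = 20, max_volume: int = 100, steps: int = 10) -> int:
    """Map a noise level to an announcement volume, louder for higher levels."""
    return min_volume + (max_volume - min_volume) * (level - 1) // (steps - 1)


for measured_db in (40.0, 65.0, 90.0):
    level = noise_level(measured_db)
    print(measured_db, level, output_volume(level))
```

A linear mapping is the simplest choice; a deployed system could equally use a lookup table tuned per site.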

Meanwhile, in another embodiment of the present invention, a cough sound recognition unit (40) is additionally provided so that cough sounds can be detected and the possible presence of a cold, viral influenza, or the like can be assessed in real time.

As shown in FIG. 6, the cough sound recognition unit (40) recognizes cough sounds among the sounds collected through the input unit and notifies the relevant institutions through the IOT linkage unit (11).

Cough sounds can be recognized in various ways; for example, they may be recognized by a method using STT (Speech To Text).

That is, the cough sound recognition unit (40) includes: a vector extraction module (42) that analyzes patterns in the sound data collected through the input unit and extracts the sound data corresponding to a cough; a database (48) that stores cough sound data learned by machine learning or deep learning; a determination module (44) that compares the sound data extracted by the vector extraction module (42) with the cough sound data stored in the database (48) to determine whether a cough has occurred; and an output module (46) that, when a cough is determined, transmits a signal to the IOT linkage unit (11) so that the relevant institutions are notified through the notification module (34).

In the cough sound recognition unit (40), a variety of cough sounds are first collected, learned by machine learning or deep learning, and stored in the database (48).

The vector extraction module (42) then analyzes the sound data collected from the outside through a microphone, quantifies the height (H), the interval (R), and the like of the waveform frequency, and converts them into a vector value. Because the vector extraction module (42) is connected to a time input module (49), it can vectorize cough-related sound data only within a specific period.
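
One way to picture the quantification of height (H) and interval (R) is the following Python sketch. It assumes that H is the mean amplitude of local waveform peaks and R is the mean time between consecutive peaks; the specification does not define these quantities precisely, and the function name `extract_vector` and the `threshold` parameter are hypothetical.

```python
def extract_vector(samples, sample_rate, threshold=0.1):
    """Return [H, R]: mean peak height and mean peak interval in seconds.

    A sample counts as a peak if it exceeds `threshold`, is greater than
    its left neighbour, and is not smaller than its right neighbour.
    """
    peaks = [
        i for i in range(1, len(samples) - 1)
        if samples[i] > threshold
        and samples[i] > samples[i - 1]
        and samples[i] >= samples[i + 1]
    ]
    if not peaks:
        return [0.0, 0.0]
    mean_height = sum(samples[i] for i in peaks) / len(peaks)
    if len(peaks) < 2:
        return [mean_height, 0.0]
    # Convert index gaps between consecutive peaks into seconds.
    gaps = [(b - a) / sample_rate for a, b in zip(peaks, peaks[1:])]
    return [mean_height, sum(gaps) / len(gaps)]


if __name__ == "__main__":
    waveform = [0.0, 0.5, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0]
    print(extract_vector(waveform, sample_rate=4))
```

A two-element vector is used here only to keep the example small; a practical feature vector would likely contain many more components.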

For example, by having the vector extraction module (42) vectorize sound data only during periods when a disease is spreading, such as a flu season, cough sounds can be recognized more effectively.
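
The period-limited vectorization can be sketched in Python as a simple date check in front of the vectorizing step. The function names `in_active_period` and `vectorize_if_active`, the use of calendar dates as the period boundaries, and the example dates are assumptions; the specification says only that the time input module (49) restricts vectorization to a specific period.

```python
from datetime import date


def in_active_period(today, start, end):
    """Return True if `today` lies within the inclusive period [start, end]."""
    return start <= today <= end


def vectorize_if_active(samples, today, start, end, vectorize):
    """Apply `vectorize` to `samples` only inside the active period.

    Returns None outside the period, so no cough analysis takes place.
    """
    if not in_active_period(today, start, end):
        return None
    return vectorize(samples)


if __name__ == "__main__":
    season_start = date(2020, 12, 1)   # hypothetical flu-season start
    season_end = date(2021, 3, 31)     # hypothetical flu-season end
    result = vectorize_if_active(
        [0.1, 0.4, 0.2], date(2021, 1, 15), season_start, season_end,
        vectorize=lambda s: [max(s), len(s)],
    )
    print(result)
```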

The determination module (44) then compares the sound data converted into a vector value with the cough data stored in the database (48).

If the comparison finds similar cough data, the sound is recognized as a cough.
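
The comparison step can be illustrated with a short Python sketch. Cosine similarity and the 0.95 threshold are assumptions chosen for illustration; the specification does not name a similarity measure or a threshold, and the function names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def is_cough(vector, stored_vectors, threshold=0.95):
    """Return True if `vector` is similar enough to any stored cough vector."""
    return any(
        cosine_similarity(vector, reference) >= threshold
        for reference in stored_vectors
    )


if __name__ == "__main__":
    stored = [[0.8, 0.3], [0.6, 0.5]]   # hypothetical learned cough vectors
    print(is_cough([0.79, 0.31], stored))
    print(is_cough([0.0, 1.0], stored))
```

Cosine similarity ignores vector magnitude, so a measure such as Euclidean distance may be more suitable if absolute loudness matters.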

When the sound is a cough, the output module (46) transmits a signal to the IOT linkage unit (11), and the relevant institutions, such as hospitals, public health centers, city halls, and disease control agencies, are notified through the notification module (34).

The institution can thus use the GPS signal to locate the stop where the cough sound occurred and plan measures against the spread of the influenza virus and the like.

In addition, an announcement can be made through the speaker at that stop, advising the pedestrian who coughed to visit a hospital or other relevant institution.

The smart volume control system described above consists of various hardware, such as a microprocessor, and software that runs on it; it may be implemented in the form of program instructions executable through such computer components and recorded on a computer-readable recording medium. The computer-readable recording medium may include program instructions, data files, data structures, and the like, alone or in combination.

Claims

A smart volume control system (1) comprising:
an input unit (3) for collecting ambient sound;
an extraction unit (5) that analyzes, in real time, the sound input through the input unit (3) and extracts sound data and noise data at the same time;
a voice analysis unit (7) that divides the extracted sound data by time and analyzes it;
a learning unit (9) that learns the sound data extracted by the extraction unit (5) through deep learning, extracts and vectorizes sound patterns, analyzes their similarity, and distinguishes whether an analyzed sound pattern is general noise or accident noise;
an IOT linkage unit (11) that interworks with an IOT device to notify an external institution according to the result learned by the learning unit (9);
a noise control unit (13) that classifies the sound analyzed by the voice analysis unit (7) according to intensity; and
an output unit (17) that makes announcements according to the analysis results of the voice analysis unit (7) and the noise control unit (13).

The smart volume control system (1) of claim 1,
wherein the extraction unit (5) includes: an input terminal (20) to which the sound wave signal transmitted from the input unit (3) is input; an amplifier (22) that amplifies the digital signal output from the input terminal (20); a band filter (24) that passes only frequencies of a specific band among the amplified signals; a detector (26) that detects the filtered specific frequency; and a shaper (28) that shapes the signal output from the detector (26), and wherein the extraction unit (5) extracts a specific frequency band according to the following equations.
Boundary frequency of the first capacitor: f1 = 1/(2π·R·C1) -------- Equation 1
Boundary frequency of the second capacitor: f2 = 1/(2π·R·C2) -------- Equation 2
(f1: first boundary frequency, f2: second boundary frequency, R: resistance, C1, C2: capacitances of the first and second capacitors)

The smart volume control system (1) of claim 1,
wherein the learning unit (9) includes: a central processing unit (10) that classifies the sound data by a deep learning method according to the degree of similarity; and a sound classification module (30) that distinguishes whether the sound data classified by the central processing unit (10) is general noise or accident noise.

The smart volume control system (1) of claim 1,
wherein the noise control unit (13) includes a band filter that evaluates noise intensity level by level,
and classifies the noise intensity into set ranges ordered from the lowest noise level to the highest.

The smart volume control system (1) of claim 3,
wherein the IOT linkage unit (11) includes: a search module (32) that searches for an institution related to the sound determined by the sound classification module (30); and a notification module (34) that notifies the found institution of the occurrence of an accident by transmitting a signal to it.

The smart volume control system (1) of claim 5,
wherein the search module (32) searches a list of external institutions stored in the database (48) and selects the institution related to the corresponding sound.

The smart volume control system (1) of claim 5,
further comprising a cough sound recognition unit (40), wherein the cough sound recognition unit (40) includes: a vector extraction module (42) that analyzes patterns in the sound data collected through the input unit and extracts the sound data corresponding to a cough; a database (48) that stores cough sound data learned by machine learning or deep learning; a determination module (44) that compares the sound data extracted by the vector extraction module (42) with the cough sound data stored in the database (48) to determine whether a cough has occurred; and an output module (46) that, when a cough is determined, transmits a signal to the IOT linkage unit (11) so that the relevant institution is notified through the notification module (34),
and wherein the vector extraction module (42) is linked with a time input module (49) so that sound pattern analysis and vector extraction are performed only during a specific period.