KR102416175B1

KR102416175B1 - Apparatus and method for accident detection

Info

Publication number: KR102416175B1
Application number: KR1020200051008A
Authority: KR
Inventors: 김희중; 유병관; 권한준; 김춘경; 임태훈; 김성도; 신영달; 김준현
Original assignee: 한국도로공사; (주)에이엔제이솔루션
Priority date: 2020-04-27
Filing date: 2020-04-27
Publication date: 2022-07-05
Also published as: KR20210132514A

Abstract

An apparatus and method for detecting an accident occurring at a remote location are disclosed. The apparatus and method perform acoustic accident detection through deep learning on accident sounds, so that accidents on the road can be identified and handled more quickly and accurately.

Description

Accident detection device and method

The present invention relates to an apparatus and method for detecting an accident occurring at a remote location, and more particularly to an acoustic accident detection apparatus and method based on accident sound deep learning.

With many vehicles on the road, traffic accidents occur for various reasons and cause loss of life and property. In particular, where vehicles travel at high speed, such as on a highway, the risk of a secondary accident after an initial accident is very high, so an accident detection system that can notify following vehicles of the accident is needed.

Video-based accident detection systems have recently come into wide use, but cases of erroneous detection are increasing due to environmental factors such as bad weather or smoke from a fire blocking the camera's field of view.

An accident detection system based on video data has the advantage of conveying road information immediately, but the problems described above place an inherent limit on its detection performance.

One object of the present invention is to solve the above problems by providing an accident detection device and method that detect whether an accident has occurred based on sound collected from a road.

Another object of the present invention is to provide an accident detection device and method capable of quickly detecting whether an accident has occurred through deep learning on accident sounds.

The technical problems to be solved by the present invention are not limited to those mentioned above, and other technical problems not mentioned will be clearly understood from the description below by those of ordinary skill in the art to which the present invention belongs.

To achieve the above objects, one embodiment provides an apparatus for detecting an accident occurring at a remote location, the apparatus comprising a memory buffer that stores a stream of input sound sensed through a sound collector disposed in the vicinity of the remote location, and at least one processor, wherein the processor is configured to obtain a series of sub-sounds from the stream of input sound, generate a series of input data for executing an accident sound learning model based on the series of sub-sounds, and determine a detection result as to whether the input sound includes an accident sound according to an execution result of the accident sound learning model based on the series of input data.

To achieve the above objects, one embodiment provides a method of detecting an accident occurring at a remote location, the method comprising: acquiring a stream of input sound sensed through a sound collector disposed in the vicinity of the remote location; obtaining a series of sub-sounds from the stream of input sound; generating a series of input data for executing an accident sound learning model based on the series of sub-sounds; and determining a detection result as to whether the input sound includes an accident sound according to an execution result of the accident sound learning model based on the series of input data.

The means for solving the technical problems addressed by the present invention are not limited to those mentioned above, and other means not mentioned will be clearly understood from the description below by those of ordinary skill in the art to which the present invention belongs.

According to the embodiments, accidents on the road can be identified and handled more quickly and accurately by compensating for the shortcomings of conventional video-based accident analysis systems.

According to the embodiments, the accuracy of accident detection is improved by using an accident sound learning model based on a deep learning algorithm.

The effects of the present invention are not limited to those mentioned above, and other effects not mentioned will be clearly understood from the description below by those skilled in the art.

FIG. 1 is a diagram showing an operating environment of an accident detection system according to an embodiment.
FIG. 2 is a block diagram showing the configuration of an accident detection system according to an embodiment.
FIG. 3 is a block diagram of an accident detection device according to an embodiment.
FIG. 4 is a flowchart of an accident detection method according to an embodiment.
FIG. 5 is a diagram for explaining a process of obtaining sub-sounds according to an embodiment.
FIG. 6 is a flowchart of a process of generating input data according to an embodiment.
FIG. 7 is a graph illustrating an example of classification for generating input data according to an embodiment.
FIG. 8 is a diagram illustrating an example of an accident sound learning model according to an embodiment.
FIG. 9 is a diagram for explaining a process of determining whether an accident has occurred according to an embodiment.
FIG. 10 is a flowchart of a process of generating input data according to an embodiment.
FIG. 11 is a diagram illustrating an example of matrix data divided to generate input data according to an embodiment.
FIG. 12 is a diagram illustrating an example of an accident sound learning model according to an embodiment.
FIG. 13 is a diagram showing an operation process of an accident processing unit according to an embodiment.
FIG. 14 is a diagram showing a data flow of an accident detection system according to an embodiment.
FIG. 15 is a table showing example accident detection results of an accident detection system according to an embodiment.
FIG. 16 is a table showing example erroneous detection results of an accident detection system according to an embodiment.

The embodiments disclosed herein are described in detail below with reference to the accompanying drawings. The same or similar components are given the same or similar reference numerals, and redundant descriptions of them are omitted. In addition, where a detailed description of related known technology could obscure the gist of the embodiments disclosed herein, that detailed description is omitted.

The terms used in the present application are used only to describe specific embodiments and are not intended to limit the present invention. A singular expression includes the plural unless the context clearly indicates otherwise. In the present application, terms such as "comprise" or "have" specify that a feature, number, step, operation, component, part, or combination thereof described in the specification is present, and should be understood not to preclude the presence or addition of one or more other features, numbers, steps, operations, components, parts, or combinations thereof. Terms such as first and second may be used to describe various components, but the components should not be limited by these terms. The terms are used only to distinguish one component from another.

FIG. 1 is a diagram showing an operating environment of an accident detection system according to an embodiment.

The accident detection system according to the embodiment may analyze sound collected from a road to determine whether an accident has occurred, and may provide accident information such as the type and location of the accident.

The accident detection system includes a sound collector 10 and an accident detection device 100.

The sound collector 10 may be disposed on a road to acquire input sound generated on the road. In one example, one or more sound collectors 10 may be disposed along the road on which vehicles travel. In one example, a plurality of sound collectors 10 may be disposed inside a tunnel.

The sound collector 10 transmits the collected input sound to the accident detection device 100. For example, the sound collector 10 may transmit the input sound to the accident detection device 100 in real time or periodically.

For example, when an accident occurs at an arbitrary location S on the road, the sound collector 10 disposed in the vicinity of the accident location S senses input sound that includes the accident sound caused by the accident (for example, a sudden braking sound, an impact sound, an explosion sound, or a horn sound). The sound collector 10 disposed in the vicinity of the accident location S may transmit the input sound including the accident sound to the accident detection device 100 in real time. In one example, the accident location S may be inside a tunnel.

The accident detection device 100 analyzes the input sound received from the sound collector 10 to detect whether an accident has occurred. The configuration of the accident detection device 100 is described later with reference to FIG. 3.

The accident detection system may further include an operating device 200. The accident detection device 100 transmits the accident detection result to the operating device 200, and the operating device 200 provides the operator with the necessary information according to the accident detection result received from the accident detection device 100.

The sound collector 10, the accident detection device 100, and the operating device 200 may be located at physically separate sites. For example, the sound collector 10 may be disposed at a place on the road on which vehicles travel, such as inside a tunnel. The accident detection device 100 may be disposed at a location remote from the sound collector 10 and may analyze input sound collected from a plurality of sound collectors 10. For example, the accident detection device 100 may be disposed outside the tunnel or at a control center. The operating device 200 may be located at a central control center as a central control server. The operating device 200 may determine a response action according to the accident detection result received from the accident detection device 100 and provide the necessary information to the operator.

FIG. 2 is a block diagram showing the configuration of an accident detection system according to an embodiment.

As described above with reference to FIG. 1, the accident detection system may include a sound collector 10 and an accident detection device 100.

The sound collector 10 is disposed on the road and collects sound generated on the road. The sound collector 10 may include a microphone, an audio codec, a microprocessor, and an Ethernet controller.

The microphone may convert sound generated on the road into an electrical signal. The audio codec may convert the analog signal of the sound collected by the microphone into digital data.

The sound collector 10 transmits the digital data converted by the audio codec to the accident analysis unit 100a, and the accident analysis unit 100a analyzes the received digital data.

The microprocessor may control the peripheral chips embedded in the sound collector 10, collect data from them, and perform calculations. The microprocessor may also receive the digital audio data from the audio codec and store it in internal memory. In addition, the microprocessor may control the Ethernet controller to communicate with the accident detection device 100. In other words, the microprocessor may serve as the CPU of the sound collector 10.

The Ethernet controller handles TCP/IP communication between the sound collector 10 and the accident detection device 100, for example the accident analysis unit 100a. The Ethernet controller also enables communication between the sound collector 10 and an external server, for example the operating device 200.

The accident detection device 100 includes an accident analysis unit 100a that determines whether an accident has occurred based on the input sound. The accident analysis unit 100a processes and analyzes the input sound received from the sound collector 10 to determine an accident detection result. Here, the input sound corresponds to the digital information obtained when the audio codec of the sound collector 10 converts the analog signal of the sound collected through the microphone of the sound collector 10.

The accident analysis unit 100a may provide input data obtained by processing the input sound to the accident sound learning model to determine whether an accident has occurred, that is, the accident detection result. The accident analysis unit 100a may transmit the accident detection result to the accident processing unit 100b and/or the operating device 200.

The accident detection device 100 may further include an accident processing unit 100b that provides and carries out an accident response plan according to the accident detection result of the accident analysis unit 100a. The accident processing unit 100b may determine and perform a response action based on the accident detection result of the accident analysis unit 100a. The operation of the accident processing unit 100b is described later with reference to FIG. 13.

Meanwhile, the accident analysis unit 100a may deliver the accident detection result to the operating device 200 through a linkage interface with an external server. Here, the external server includes a web server that provides monitoring and operation services for the accident detection system. The operating device 200 refers to a terminal device that connects to such an external server and receives the monitoring and operation services for the accident detection system. The terminal device may include various types of electronic devices capable of running a web application, such as a desktop computer, a smartphone, a tablet, or a laptop computer.

The accident analysis unit 100a and the accident processing unit 100b give a schematic, functional view of the accident detection device 100. The configuration of the accident detection device 100 is described in more detail below with reference to FIG. 3.

FIG. 3 is a block diagram of an accident detection device according to an embodiment.

The accident detection device 100 may detect an accident that has occurred at a remote location S. The remote location S may be a place at a distance from where the accident detection device 100 is located. When an accident occurs at the remote location S, the sound collector 10 disposed in the vicinity of the remote location S acquires input sound including the accident sound and transmits the acquired input sound to the accident detection device 100.

The accident detection device 100 may include a communication unit 110, a storage unit 120, and a processor 130. The components shown in FIG. 3 are exemplary, and the accident detection device 100 may include only some of the components shown in FIG. 3 or may include additional components.

The communication unit 110 performs data communication with external devices, including the sound collector 10 and an external server. The accident detection device 100 may transmit data to and receive data from an external device through the communication unit 110. In one example, the communication unit 110 may include software and hardware modules for transmitting and receiving data using a communication protocol such as TCP/IP.

The storage unit 120 stores the input data, intermediate data, detection results, and other data needed for accident detection. The storage unit 120 may include a memory buffer 121, a database 122, and a model storage unit 123.

The memory buffer 121 is a memory space allocated to store the stream of input sound received from the sound collector 10. The input sound collected by the sound collector 10 is delivered to the accident detection device 100 in real time. For example, the sound collector 10 may deliver the collected input sound to the accident detection device 100 by real-time streaming. The accident detection device 100 may store the stream of input sound received from the sound collector 10 in the memory buffer 121.

The accident detection device 100 may continuously read the stream of input sound stored in the memory buffer 121 in fixed units and generate input data to be provided as input to the accident sound learning model. As the accident detection device 100 reads a unit of the input sound stream from the memory buffer 121 to generate input data, the memory buffer 121 may be refilled with the same amount of the subsequent stream. In one example, the memory buffer 121 may operate in a FIFO (First In First Out) manner.
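The FIFO behaviour described above can be illustrated with a minimal Python sketch. The class name `StreamBuffer`, its methods, and the sample values are hypothetical and are not taken from the patent; the sketch only shows that samples are read in fixed units in arrival order while newer samples are appended behind them.

```python
from collections import deque


class StreamBuffer:
    """Hypothetical FIFO buffer of audio samples; oldest samples are read first."""

    def __init__(self):
        self._samples = deque()

    def write(self, samples):
        # Append newly received samples at the tail of the queue.
        self._samples.extend(samples)

    def read(self, unit):
        # Return `unit` samples from the head, or None if too few are buffered.
        if len(self._samples) < unit:
            return None
        return [self._samples.popleft() for _ in range(unit)]

    def __len__(self):
        return len(self._samples)


buf = StreamBuffer()
buf.write([1, 2, 3, 4, 5])   # samples arriving from the sound collector
first = buf.read(3)          # oldest three samples
buf.write([6, 7])            # subsequent stream refills the buffer
second = buf.read(3)         # next three samples, still in arrival order
```

A real deployment would more likely hold raw PCM bytes in a bounded ring buffer; a `deque` is used here only to keep the example short.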

The database 122 indexes and stores the input sound data received from the sound collector 10. Here, the input sound data may include audio files. The database 122 may work with the external server so that, in response to a request sent from the operating device 200 connected to the external server, the input sound data stored in the database 122 is provided to the operating device 200 through the external server.

The model storage unit 123 may store the accident sound learning model that the accident detection device 100 executes for accident detection. The accident sound learning model may include a learning model based on an artificial neural network. For example, the accident sound learning model may be implemented using a deep learning algorithm based on a DNN (Deep Neural Network) or an RNN (Recurrent Neural Network).

The processor 130 corresponds to the central processing unit (CPU) of the accident detection device 100. The processor 130 may include one or more processors.

The processor 130 may control the operation of the accident analysis unit 100a.

The processor 130 may be configured to obtain a series of sub-sounds from the stream of input sound, generate a series of input data for executing the accident sound learning model based on the obtained series of sub-sounds, and determine a detection result as to whether the input sound includes an accident sound according to an execution result of the accident sound learning model based on the generated series of input data.

The processor 130 may be configured to obtain a series of sub-sounds from the stream of input sound stored in the memory buffer 121. To this end, the processor 130 may divide the stream of input sound according to a predetermined time length to obtain the series of sub-sounds. The sub-sound acquisition process is described in detail later with reference to FIG. 5.

In one example, the series of sub-sounds includes a first sub-sound and a second sub-sound that follows the first sub-sound, and the processor 130 may obtain, from the stream of input sound, a second sub-sound that overlaps at least a portion of the first sub-sound.

In one example, the series of sub-sounds includes a first sub-sound and a second sub-sound that follows the first sub-sound, and the processor 130 may obtain the second sub-sound starting from the point in the stream of input sound at which the first sub-sound ends.
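The two splitting variants above (overlapping and back-to-back sub-sounds) can be sketched with a single windowing function. This is a minimal illustration, assuming the stream is a sequence of samples; the function name and the `window` and `hop` parameters (both in samples) are hypothetical and do not appear in the patent.

```python
def split_into_sub_sounds(samples, window, hop):
    """Cut `samples` into sub-sounds of `window` samples, advancing by `hop`.

    hop < window gives overlapping sub-sounds; hop == window gives
    sub-sounds that start exactly where the previous one ends.
    """
    if window <= 0 or hop <= 0:
        raise ValueError("window and hop must be positive")
    last_start = len(samples) - window
    return [samples[start:start + window] for start in range(0, last_start + 1, hop)]


stream = list(range(10))
overlapping = split_into_sub_sounds(stream, window=4, hop=2)
back_to_back = split_into_sub_sounds(stream, window=4, hop=4)
```

With `hop=2` each sub-sound shares its first two samples with the end of the previous one; with `hop=4` the trailing two samples of the stream do not fill a complete window and are left for the next read.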

The processor 130 may be configured to generate a series of input data for executing the accident sound learning model based on the obtained series of sub-sounds. To this end, the processor 130 may convert the current sub-sound among the series of sub-sounds into a set of frequency component values for a predetermined frequency band, and generate input data corresponding to the current sub-sound based on the set of frequency component values. In one example, the processor 130 may convert the current sub-sound into the set of frequency component values for the predetermined frequency band through a Fast Fourier Transform (FFT).
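As an illustration of converting a sub-sound into frequency component values for a band, the following sketch uses a textbook recursive radix-2 FFT written with the Python standard library. The function names, the magnitude scaling by the window length, and the 8 Hz toy signal are assumptions made for the example; the patent does not specify the FFT size, sampling rate, or frequency band.

```python
import cmath
import math


def fft(x):
    """Recursive radix-2 Cooley-Tukey FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 0 or (n > 1 and n % 2):
        raise ValueError("length must be a non-zero power of two")
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def band_components(sub_sound, sample_rate, f_low, f_high):
    """Return (frequency_hz, magnitude) pairs for bins inside [f_low, f_high]."""
    n = len(sub_sound)
    spectrum = fft(sub_sound)
    components = []
    for k in range(n // 2 + 1):
        freq = k * sample_rate / n
        if f_low <= freq <= f_high:
            components.append((freq, abs(spectrum[k]) / n))
    return components


# Hypothetical example: a 2 Hz tone sampled at 8 Hz for one second.
tone = [math.sin(2 * math.pi * 2 * i / 8) for i in range(8)]
components = band_components(tone, sample_rate=8, f_low=1, f_high=3)
```

Here the 2 Hz bin carries essentially all of the energy while the neighbouring bins are close to zero. In practice an optimized FFT library would be used instead of this recursive version.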

In one example, the processor 130 may determine a first set of classification values according to the difference between a first array of the set of frequency component values and a second array of frequency component values of a baseline sound, determine a second set of classification values by normalizing the change in each frequency component value between the first array and the second array, and generate input data corresponding to the current sub-sound based on at least one of the first set of classification values and the second set of classification values. Here, the baseline sound may include at least one sub-sound in the series that precedes the current sub-sound. This is described later with reference to FIGS. 6 and 7.
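One possible reading of this comparison against a baseline sound is sketched below. Everything concrete here is an assumption: the baseline is taken as the per-bin mean of earlier spectra, the first set of values as the per-bin difference, and the second set as the change relative to the baseline. The patent's actual classification rule is given with FIGS. 6 and 7 and is not reproduced by this sketch.

```python
def baseline_spectrum(previous_spectra):
    """Assumed baseline: per-bin mean over the spectra of earlier sub-sounds."""
    count = len(previous_spectra)
    return [sum(column) / count for column in zip(*previous_spectra)]


def difference_values(current, baseline):
    """Assumed first set: per-bin difference between current and baseline."""
    return [c - b for c, b in zip(current, baseline)]


def normalized_change_values(current, baseline, eps=1e-12):
    """Assumed second set: per-bin change relative to the baseline level."""
    return [(c - b) / (b + eps) for c, b in zip(current, baseline)]


# Hypothetical two-bin spectra for two earlier sub-sounds and the current one.
previous = [[1.0, 2.0], [3.0, 2.0]]
current = [4.0, 1.0]
baseline = baseline_spectrum(previous)
first_set = difference_values(current, baseline)
second_set = normalized_change_values(current, baseline)
```

The relative form makes a jump in a normally quiet band stand out as much as a jump in a loud band, which is one plausible motivation for normalizing; any mapping of these values onto discrete classes is left out.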

In one example, the processor 130 may generate a matrix of feature values for the set of frequency component values, and generate input data corresponding to the current sub-sound based on a plurality of sub-matrices produced by dividing the matrix of feature values according to a predetermined time interval. This is described later with reference to FIGS. 10 and 11.
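Dividing a feature matrix along the time axis can be sketched as follows, assuming each row of the matrix holds the feature values of one time frame. The function name and the `frames_per_block` parameter are hypothetical, and the kind of feature values stored in the matrix is not specified in this passage.

```python
def split_feature_matrix(matrix, frames_per_block):
    """Divide a (time frames x features) matrix into consecutive sub-matrices.

    Each sub-matrix covers `frames_per_block` rows; a trailing remainder
    that does not fill a whole block is dropped in this sketch.
    """
    if frames_per_block <= 0:
        raise ValueError("frames_per_block must be positive")
    last_start = len(matrix) - frames_per_block
    return [
        matrix[start:start + frames_per_block]
        for start in range(0, last_start + 1, frames_per_block)
    ]


# Hypothetical matrix: 7 time frames, 2 feature values per frame.
features = [[t, t * 10] for t in range(7)]
blocks = split_feature_matrix(features, frames_per_block=2)
```

Each resulting block could then serve as one input unit to the model; whether a partial final block is dropped, padded, or carried over is a design choice the sketch does not settle.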

Meanwhile, the processor 130 may obtain a plurality of execution results of the accident sound learning model and determine the detection result based on the plurality of execution results. This is described in detail with reference to FIG. 9.
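Combining several execution results into one detection result could look like the following sketch. The rule used here (report an accident when at least `min_hits` of the last `window` model scores reach `score_threshold`) is an assumption chosen for illustration, as are all names and values; the rule the patent actually uses is the one described with FIG. 9.

```python
from collections import deque


class DetectionAggregator:
    """Hypothetical vote over the most recent model outputs."""

    def __init__(self, window=5, score_threshold=0.5, min_hits=3):
        self._scores = deque(maxlen=window)  # keeps only the latest `window` scores
        self.score_threshold = score_threshold
        self.min_hits = min_hits

    def update(self, score):
        # Record the newest score and count how many recent scores look like accidents.
        self._scores.append(score)
        hits = sum(1 for s in self._scores if s >= self.score_threshold)
        return hits >= self.min_hits


aggregator = DetectionAggregator(window=3, score_threshold=0.5, min_hits=2)
decisions = [aggregator.update(s) for s in [0.9, 0.1, 0.8, 0.2, 0.1]]
```

Requiring agreement across several consecutive results trades a little latency for fewer false alarms from a single noisy sub-sound.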

The processor 130 may control the operation of the accident processing unit 100b. For example, the processor 130 may be further configured to generate a command for controlling a camera installed in the vicinity of the remote location based on the detection result obtained using the accident sound learning model. The processor 130 may also be further configured to generate a notification message about accident information based on the detection result obtained using the accident sound learning model. This is described later with reference to FIG. 13.

Additionally, the accident detection device 100 may further include a learning processor (not shown). The learning processor may, in conjunction with the processor 130 or independently of it, perform operations for training the accident sound learning model stored in the model storage unit 123 or for inferring whether an accident has occurred using the accident sound learning model.

FIG. 4 is a flowchart of an accident detection method according to an embodiment.

The accident detection device 100 may perform an accident detection method that determines whether an accident has occurred based on the input sound acquired from the sound collector 10.

To detect an accident occurring at a remote location, the accident detection method according to the embodiment may include: acquiring a stream of input sound sensed through the accident sound collector 10 disposed near the remote location (S310); acquiring a series of sub sounds from the stream of input sound (S320); generating, based on the series of sub sounds, a series of input data for executing the accident sound learning model (S330); and determining a detection result as to whether the input sound includes an accident sound according to the execution result of the accident sound learning model based on the series of input data (S340).

In step S310, referring to FIG. 3, the processor 130 may acquire a stream of input sound sensed through the accident sound collector 10 disposed near the remote location S.

In one example, the remote location S may be any location on a road. In one example, the sound collector 10 may be placed inside a tunnel to collect sounds generated inside the tunnel. When an accident occurs inside the tunnel, the accident location, that is, the remote location S, is inside the tunnel, and the sound collector 10 disposed near the remote location S acquires an input sound that includes the accident sound generated by the accident. The sound collector 10 transmits the acquired input sound to the accident detection device 100. For example, the sound collector 10 may transmit the stream of input sound to the accident detection device 100 by real-time streaming.

In step S310, the processor 130 receives, through the communication unit 110, the stream of input sound from the accident sound collector 10 disposed near the remote location S and stores it in the memory buffer 121. Subsequently or at the same time, the processor 130 may continuously read the stream of input sound from the memory buffer 121.

In step S320, the processor 130 may obtain a series of sub sounds from the stream of input sound acquired in step S310.

In step S320, the processor 130 may obtain the series of sub sounds by dividing the stream of input sound acquired in step S310 according to a predetermined time length. The predetermined time length can be set as an environment variable of the accident detection device 100. For example, the predetermined time length may be 0.1 seconds or less. As another example, the predetermined time length may be 10 seconds or more. In one example, a sub sound for an RNN-based accident sound learning model may be generated with a longer length than a sub sound for a DNN-based accident sound learning model, but the embodiment is not limited thereto. The predetermined time length can be adjusted according to factors such as the installation location of the accident detection device 100, the installation environment, and the noise level. Step S320 is described below with reference to FIG. 5.

FIG. 5 is a diagram for explaining a process of acquiring sub sounds according to an embodiment.

The processor 130 may obtain a series of sub sounds by dividing the stream of input sound stored in the memory buffer 121 according to a predetermined time length. The series of sub sounds may include a first sub sound and a second sub sound that follows the first sub sound.

In one example, the processor 130 may obtain, from the stream of input sound stored in the memory buffer 121, a second sub sound that overlaps at least part of the first sub sound.

Referring to FIG. 5 by way of example, the left box of FIG. 5 shows the memory buffer 121 at a first time point T1, in which at least part of the input sound stream IN_STRM is stored. The processor 130 may divide the input sound stream IN_STRM stored in the memory buffer 121 into a first block B1, a second block B2, a third block B3, and a fourth block B4 according to a predetermined time length. For example, the predetermined time length may be 0.05 seconds. In one example, the processor 130 may obtain the first sub sound SUB_S1 by bundling the first block B1 and the second block B2; in that case, the time length of the first sub sound SUB_S1 may be 0.1 seconds. In another example, the processor 130 may obtain the first sub sound SUB_S1 by bundling two or more blocks (for example, B1, B2, and B3).

The size of a sub sound may vary with the sampling rate of the input sound. For example, if 48,000 bytes are sampled per second, a sub sound 0.1 seconds long has a size of 4,800 bytes.

The right box of FIG. 5 shows the memory buffer 121 at a second time point T2, after the first sub sound SUB_S1 has been generated at the first time point T1. The processor 130 may obtain, from the input sound stream IN_STRM, a second sub sound SUB_S2 that overlaps at least part of the first sub sound SUB_S1. In the example shown in FIG. 5, the second sub sound SUB_S2 corresponds to the second block B2 and the third block B3 bundled together, so the second sub sound SUB_S2 and the first sub sound SUB_S1 overlap in the second block B2. In other words, after the first sub sound SUB_S1 containing the first block B1 and the second block B2 is generated, the first block B1 is deleted from the memory buffer 121, while the second block B2 may be kept in the memory buffer 121 for generating the subsequent second sub sound SUB_S2.

In another example, the processor 130 may obtain the second sub sound starting from the point where the first sub sound ends in the stream of input sound stored in the memory buffer 121.

In other words, the processor 130 may obtain the second sub sound SUB_S2 immediately after the first sub sound SUB_S1 so that the two do not overlap.

For example, referring to FIG. 5, the processor 130 may generate the first sub sound SUB_S1 so that it contains the first block B1 and the second block B2, and the second sub sound SUB_S2 so that it contains the third block B3 and the fourth block B4.
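The two grouping modes described for FIG. 5 (overlapping and non-overlapping sub sounds) can be sketched as follows. This is only an illustration under stated assumptions, not the implementation of the embodiment: the function names `split_blocks` and `group_sub_sounds` and the `hop` parameter are hypothetical, and the byte-oriented framing simply reuses the example figures from the text (48,000 bytes per second, 0.05-second blocks).

```python
def split_blocks(stream: bytes, bytes_per_second: int = 48000, block_ms: int = 50) -> list[bytes]:
    """Split a buffered stream into fixed-length blocks (B1, B2, ...).

    An incomplete trailing block is left out, as if it were still filling up.
    """
    block_size = bytes_per_second * block_ms // 1000  # 2400 bytes for 0.05 s
    full_blocks = len(stream) // block_size
    return [stream[i * block_size:(i + 1) * block_size] for i in range(full_blocks)]


def group_sub_sounds(blocks: list[bytes], blocks_per_sub: int = 2, hop: int = 1) -> list[bytes]:
    """Bundle consecutive blocks into sub sounds.

    hop < blocks_per_sub yields overlapping sub sounds (hop=1: B1+B2, B2+B3, ...).
    hop == blocks_per_sub yields back-to-back sub sounds (B1+B2, B3+B4, ...).
    """
    if hop < 1 or blocks_per_sub < 1:
        raise ValueError("hop and blocks_per_sub must be positive")
    sub_sounds = []
    for start in range(0, len(blocks) - blocks_per_sub + 1, hop):
        sub_sounds.append(b"".join(blocks[start:start + blocks_per_sub]))
    return sub_sounds
```

With four blocks B1 to B4, `hop=1` gives three sub sounds that share one block with their neighbor, while `hop=2` gives the two disjoint sub sounds B1+B2 and B3+B4.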

Returning to FIG. 4, the accident detection device 100 acquires the series of sub sounds in step S320 and then performs the next step S330.

In step S330, the processor 130 may generate a series of input data for executing the accident sound learning model based on the series of sub sounds obtained in step S320.

The accident sound learning model according to the embodiment may be implemented through a deep learning algorithm based on a DNN or an RNN. Artificial intelligence is briefly described below.

Artificial intelligence (AI) refers to artificial intelligence that can think and make judgments on its own, or to the technology for creating it. Machine learning is a field of artificial intelligence that develops algorithms and techniques that enable computers to learn.

An artificial neural network (ANN) is a model used in machine learning and may consist of artificial neurons (nodes) that form a network through synaptic connections. An artificial neural network may include an input layer, an output layer, and optionally one or more hidden layers. An artificial neural network may be defined by the connection pattern between neurons in different layers, the learning process that updates the model parameters, and the activation function that generates the output value. Each neuron in an artificial neural network may output the value of the activation function applied to the input signals received through its synapses, the weights, the bias, and so on. Learning in an artificial neural network means determining the model parameters that minimize a loss function. Machine learning implemented with a deep neural network (DNN), an artificial neural network that includes multiple hidden layers, is called deep learning.

Step S330 may include converting the current sub sound among the series of sub sounds into a set of frequency component values for a predetermined frequency band, and generating input data corresponding to the current sub sound based on the set of frequency component values. The detailed sub-steps of step S330 depend on the embodiment of the accident sound learning model and are described below with reference to FIGS. 6 and 10.

Next, in step S340, the processor 130 may determine a detection result as to whether the input sound includes an accident sound according to the execution result of the accident sound learning model based on the series of input data.

In step S340, the processor 130 executes the accident sound learning model based on the series of input data generated in step S330. For example, the accident sound learning model is a learning model implemented through a deep learning algorithm based on a DNN or an RNN.

Subsequently, the processor 130 determines a detection result as to whether the input sound includes an accident sound according to the execution result of the accident sound learning model in step S330. For example, the detection result may include whether the input sound includes an accident sound, the type of the accident sound, the accident probability, and the strength of the detection result.

Additionally, the accident detection method according to the embodiment may further include generating, based on the detection result of step S340, a command for controlling a camera installed near the remote location where the accident occurred.

Additionally, the accident detection method according to the embodiment may further include generating a notification message about accident information based on the detection result of step S340.

FIG. 6 is a flowchart of a process of generating input data according to an embodiment.

FIG. 6 shows the process of generating input data for a DNN-based accident sound learning model.

Step S330, described with reference to FIG. 4, includes converting the current sub sound among the series of sub sounds into a set of frequency component values for a predetermined frequency band (step S331), and generating input data corresponding to the current sub sound based on the set of frequency component values (steps S332a to S334a).

In step S331, the processor 130 may convert the current sub sound, among the series of sub sounds obtained in step S320 of FIG. 4, into a set of frequency component values for a predetermined frequency band.

For example, the processor 130 may perform this conversion of the current sub sound into a set of frequency component values for the predetermined frequency band through a fast Fourier transform (FFT).

In one example, the processor 130 may divide the frequency band produced by the conversion into predetermined intervals and store the result as an array. The processor 130 may store the generated array as the set of frequency component values. For example, the frequency band produced by the fast Fourier transform may range from about 300 Hz to 7 kHz. For example, the predetermined interval may be about 10 Hz, in which case about 650 to 700 array entries may be generated.
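The binning described above can be sketched as follows. To keep the example dependency-free, it evaluates a direct discrete Fourier transform at each bin frequency rather than running an FFT; when the window length makes the FFT bin spacing equal to the chosen interval, both give the same magnitudes, only more slowly here. The function name `band_magnitudes` and its parameters are hypothetical, and the default band (300 Hz to 7 kHz in 10 Hz steps) reuses the example figures from the text.

```python
import cmath
import math


def band_magnitudes(samples, sample_rate, f_lo=300, f_hi=7000, step=10):
    """Return one magnitude per `step`-Hz bin for f_lo <= f < f_hi.

    Direct DFT evaluated at each bin frequency, O(len(samples) * bins).
    This is a stand-in for the FFT named in the text, not a replacement for it.
    """
    magnitudes = []
    for freq in range(f_lo, f_hi, step):
        w = -2j * math.pi * freq / sample_rate
        acc = sum(x * cmath.exp(w * n) for n, x in enumerate(samples))
        magnitudes.append(abs(acc))
    return magnitudes
```

With the default band, the array has (7000 - 300) / 10 = 670 entries, which falls inside the range of about 650 to 700 mentioned above.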

Subsequently, the processor 130 may generate input data corresponding to the current sub sound based on the set of frequency component values obtained in step S331a. This includes steps S332a to S334a.

In step S332a, the processor 130 determines a first set of classification values according to the difference between a first array holding the set of frequency component values obtained in step S331a and a second array holding the frequency component values of a base sound. Here, the base sound means a sound that includes at least one sub sound earlier than the current sub sound among the series of sub sounds obtained in step S320.

In one example, the first set of classification values is an array of 0s and 1s. For each pair of corresponding elements in the second array (the frequency component values of the base sound, for example 2 seconds long) and the first array (the frequency component values of the current sound, for example 1 second long), the entry is 1 if the difference between them is at least a predetermined constant multiple, and 0 otherwise. For example, the predetermined constant may be greater than 1. For example, the predetermined constant may be 1.7, but is not limited thereto. The constant can be adjusted to an appropriate value according to the accumulated results of running the accident detection method of the accident detection device 100 according to the embodiment.
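A minimal sketch of this binary classification is shown below. The text does not spell out how "a difference of at least a constant multiple" is computed, so the sketch assumes one plausible reading: an entry is 1 when the current value is at least `k` times the corresponding base value. The function name `first_set` and the parameter `k` are hypothetical.

```python
def first_set(current, base, k=1.7):
    """Return a 0/1 array marking bins where the current sound rose sharply.

    Assumed reading: entry i is 1 when current[i] >= k * base[i], else 0.
    `current` is the first array and `base` is the second array in the text.
    """
    if len(current) != len(base):
        raise ValueError("arrays must have the same length")
    return [1 if c >= k * b else 0 for c, b in zip(current, base)]
```

For instance, with base values `[1.0, 2.0, 2.0]` and current values `[1.0, 4.0, 2.0]`, only the second bin doubles and is marked 1.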

In step S333a, the processor 130 determines a second set of classification values by normalizing the change in each frequency component value between the first array and the second array.

In one example, the second set of classification values may be determined based on the change in frequency component value between each pair of corresponding elements of the first array and the second array.

For example, the second set of classification values is an array in which each element stores the difference in frequency component value between the corresponding elements of the first array and the second array, divided by the maximum difference. In other words, each element of the second set of classification values is the difference between the corresponding elements of the first and second arrays, normalized by the maximum difference. Here, the maximum difference means the largest of the differences in frequency component value between corresponding elements of the first array and the second array. In one example, because the second set of classification values results from this normalization, its values lie between 0 and 1.
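The normalization can be sketched as below. Two details are assumptions not stated in the text: the difference is taken as an absolute value (so that results stay within 0 to 1), and an all-zero array is returned when the two arrays are identical, to avoid dividing by zero. The function name `second_set` is hypothetical.

```python
def second_set(current, base):
    """Normalize per-bin differences by the largest difference.

    Assumptions: differences are absolute values, and identical inputs
    produce all zeros instead of a division by zero.
    """
    if len(current) != len(base):
        raise ValueError("arrays must have the same length")
    diffs = [abs(c - b) for c, b in zip(current, base)]
    max_diff = max(diffs, default=0.0)
    if max_diff == 0:
        return [0.0] * len(diffs)
    return [d / max_diff for d in diffs]
```

With current values `[1.0, 5.0, 2.0]` and base values `[1.0, 1.0, 4.0]`, the differences are 0, 4, and 2, so the normalized values are 0.0, 1.0, and 0.5.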

In step S334a, the processor 130 generates input data corresponding to the current sub sound based on at least one of the first set of classification values obtained in step S332a and the second set of classification values obtained in step S333a. For example, the processor 130 may merge the array corresponding to the first set of classification values and the array corresponding to the second set of classification values into a single array and use that array as the input data. For example, the processor 130 may concatenate the two arrays into a single array and use it as the input data. Alternatively, the processor 130 may use only the first set of classification values, or only the second set of classification values, as the input data.

FIG. 7 shows graphs illustrating the classification used for generating input data according to an embodiment. In graph 710 and graph 720, the horizontal axis represents time (msec) and the vertical axis represents frequency (Hz).

Graph 710 shows an example of the first set of classification values generated in step S332a of FIG. 6.

Graph 720 shows an example of the second set of classification values generated in step S333a of FIG. 6. In graph 720, darker regions correspond to frequency bands with larger changes; for example, the graph shows a large change between the base sound and the current sub sound in the band around 150 Hz.

FIG. 8 is a diagram showing an example of an accident sound learning model according to an embodiment.

FIG. 8 shows an exemplary DNN-based accident sound learning model. The accident sound learning model is stored in the model storage unit 123.

The input data F generated in step S330 of FIG. 4, specifically through steps S331a to S334a of FIG. 6, is a single array corresponding to a one-dimensional vector. For example, the input data F is a 1340x1 vector.

The processor 130 feeds the input data F into the accident sound learning model shown in FIG. 8 and derives, as output values, a label value (whether an accident occurred and the type of accident sound), a data strength, and a probability. For example, the label values may include background sound (b0), knocking sound (b1), horn sound (b2), ambulance siren (b3), burst sound (airbreak) (b4), collision sound (n1), and skid sound (n2).

The processor 130 computes the node weights and activation functions (for example, Sigmoid or ReLU) along each connection of the accident sound learning model shown in FIG. 8, and finally obtains the output values through a softmax() function. Based on the obtained output values, the processor 130 predicts what kind of acoustic data the input sound is.
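The final softmax step and the selection of a label can be sketched as follows. The network itself is not reproduced; the sketch assumes that the last layer yields one raw score per label, in the order of the label values listed for FIG. 8. The names `LABELS`, `softmax`, and `predict` are hypothetical.

```python
import math

# Label order is an assumption; the codes are the example labels from the text.
LABELS = ["b0", "b1", "b2", "b3", "b4", "n1", "n2"]


def softmax(scores):
    """Convert raw scores to probabilities that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict(scores):
    """Return the most probable label and its probability."""
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return LABELS[best], probs[best]
```

A label beginning with `n` (collision or skid) would then count as an accident sound, and a label beginning with `b` as a normal sound.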

FIG. 9 is a diagram for explaining a process of determining whether an accident has occurred according to an embodiment.

A series of execution results of the accident sound learning model can be obtained for the series of sub sounds. With reference to FIG. 9, the following describes how the occurrence of an accident is determined from the pattern of this series of execution results.

In one example, the processor 130 may obtain a plurality of execution results of the accident sound learning model and, based on the plurality of execution results, determine a detection result as to whether the input sound includes an accident sound.

Referring to FIG. 9, assume for example that the series of execution results of the accident sound learning model for the series of sub sounds is b, n, b, b, n ... b. Here, b denotes a result in which the detected sound is a normal sound rather than an accident sound, and n denotes a result in which the detected sound is an accident sound.

The processor 130 may additionally determine whether an accident has occurred by using a window that groups a plurality of execution results.

For example, let the window size be 3. In the exemplary execution result pattern of FIG. 9, assume that the window W_Ti at the i-th time point Ti contains the three execution results (b, n, b), and the window W_Tj at the j-th time point Tj contains the three execution results (n, b, n).

In one example, the processor 130 may determine whether an accident has occurred according to the number of accident sound results (that is, results equal to n) among the execution results in the window. For example, because fewer than two of the execution results in the size-3 window W_Ti are accident sounds, the processor 130 may determine that no accident has occurred. Because two or more of the execution results in the size-3 window W_Tj are accident sounds, the processor 130 may determine that an accident has occurred.

In one example, the processor 130 may determine whether an accident has occurred according to the ratio of accident sound results among the execution results in the window. In one example, the processor 130 may determine whether an accident has occurred according to the pattern of accident sound results (results equal to n) within the execution results in the window. To this end, the processor 130 may use a function Pattern(W_Tx) that analyzes the pattern of the series of execution results within the window W_Tx at an arbitrary time point Tx.
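The count-based and ratio-based window rules can be sketched as below. The window size of 3 and the threshold of 2 come from the example in the text; the function names `window_decisions` and `accident_ratio` are hypothetical, and the Pattern(W_Tx) function is not defined in the text, so it is not reproduced here.

```python
from collections import deque


def window_decisions(results, window_size=3, min_count=2):
    """Slide a window over per-sub-sound results ('b' or 'n').

    Returns one boolean per full window: True when at least `min_count`
    results in the window are accident sounds ('n').
    """
    window = deque(maxlen=window_size)
    decisions = []
    for result in results:
        window.append(result)  # the oldest result drops out automatically
        if len(window) == window_size:
            accident_hits = sum(1 for r in window if r == "n")
            decisions.append(accident_hits >= min_count)
    return decisions


def accident_ratio(window):
    """Fraction of results in a non-empty window that are accident sounds."""
    return sum(1 for r in window if r == "n") / len(window)
```

Applied to the windows in the example, (b, n, b) yields no accident and (n, b, n) yields an accident; a ratio-based rule would instead compare `accident_ratio` against a chosen threshold.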

By obtaining a plurality of execution results of the accident sound learning model and determining from them whether the input sound includes an accident sound, the accident sound analysis can reflect continuity over time. That is, the accident detection method performed by the accident detection device according to the embodiment divides the input sound into a series of sub sounds and derives the final accident detection result not only from the accident sound analysis of each sub sound but also from the analysis results across consecutive sub sounds, which improves accuracy.

FIG. 10 is a flowchart of a process of generating input data according to an embodiment.

FIG. 10 shows the process of generating input data for an accident sound learning model based on a recurrent neural network (RNN).

An RNN has a structure in which, when the input values are sequential in nature, as with time, a previous input value affects the next input value. An RNN-based accident sound learning model can determine whether an accident has occurred by taking the state of previous input values into account through its recurrent layer.

Step S330, described with reference to FIG. 4, includes converting the current sub sound among the series of sub sounds into a set of frequency component values for a predetermined frequency band (S331), and generating input data corresponding to the current sub sound based on the set of frequency component values (steps S332b and S333b).

In step S331, the processor 130 may convert the current sub sound, among the series of sub sounds obtained in step S320 (see FIG. 4), into a set of frequency component values for a predetermined frequency band. In one example, the processor 130 may perform a fast Fourier transform (FFT) to obtain the set of frequency component values. For example, the processor 130 uses the FFT to convert the sub sound from the time domain into values in the frequency domain.
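
The time-to-frequency conversion of step S331 can be illustrated with a small sketch. To stay dependency-free it uses a naive discrete Fourier transform, which produces the same values an FFT would compute, only more slowly. The function name and the test tone are hypothetical.

```python
import cmath
import math


def dft_magnitudes(samples):
    """Naive O(n^2) DFT; returns the magnitude of bins 0..n//2 for real input."""
    n = len(samples)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * cmath.pi * k * t / n)
                  for t, x in enumerate(samples))
        mags.append(abs(acc))
    return mags


# Hypothetical test tone: exactly 4 cycles inside a 32-sample window.
n = 32
tone = [math.sin(2 * math.pi * 4 * t / n) for t in range(n)]
mags = dft_magnitudes(tone)
peak_bin = max(range(len(mags)), key=mags.__getitem__)  # bin with the most energy
```

In practice a library FFT would replace `dft_magnitudes`; the sketch only shows what "a set of frequency component values" looks like for one sub sound.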

In step S332b, the processor 130 generates a matrix of feature values for the set of frequency component values converted in step S331. In one example, the processor 130 may extract a matrix of valid feature values from the set of frequency component values using a Mel filter bank, but this is not limiting, and various other feature extraction methods may be used. In one example, a 150 x 100 matrix of feature values may be generated for a 10-second sub sound.
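
A Mel filter bank of the kind mentioned for step S332b can be sketched as below. This is an assumption-laden illustration: it uses the common 2595 * log10(1 + f / 700) Mel formula and triangular filters, and the filter count, bin count, and sample rate are hypothetical. The description does not specify the actual filter design.

```python
import math


def hz_to_mel(hz):
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_filter_bank(n_filters, n_bins, sample_rate):
    """Triangular filters over n_bins spectrum bins spanning 0..sample_rate/2."""
    max_mel = hz_to_mel(sample_rate / 2)
    # n_filters + 2 edge frequencies, equally spaced on the Mel scale.
    edges = [mel_to_hz(max_mel * i / (n_filters + 1)) for i in range(n_filters + 2)]
    bin_hz = [(sample_rate / 2) * b / (n_bins - 1) for b in range(n_bins)]
    bank = []
    for m in range(1, n_filters + 1):
        lo, mid, hi = edges[m - 1], edges[m], edges[m + 1]
        row = []
        for f in bin_hz:
            if lo < f <= mid:
                row.append((f - lo) / (mid - lo))   # rising slope
            elif mid < f < hi:
                row.append((hi - f) / (hi - mid))   # falling slope
            else:
                row.append(0.0)
        bank.append(row)
    return bank


def apply_filter_bank(bank, power_spectrum):
    """One feature value per filter: weighted sum of the power spectrum."""
    return [sum(w * p for w, p in zip(row, power_spectrum)) for row in bank]


bank = mel_filter_bank(n_filters=10, n_bins=129, sample_rate=16000)
features = apply_filter_bank(bank, [1.0] * 129)
```

Applying the bank to the spectrum of each short frame and stacking the resulting vectors over time would give a feature matrix such as the 150 x 100 example above.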

In step S333b, the processor 130 divides the matrix of feature values generated in step S332b according to a predetermined time interval and generates input data corresponding to the current sub sound based on the resulting plurality of sub-matrices. This is described below with reference to the example of FIG. 11.

FIG. 11 is a diagram illustrating matrix data divided to generate input data according to an embodiment.

As described above, in step S333b the processor 130 divides the matrix of feature values generated in step S332b according to a predetermined time interval and generates input data corresponding to the current sub sound based on the resulting plurality of sub-matrices.

FIG. 11 shows, as an example, five sub-matrices (x1 to x5) generated by dividing the 150 x 100 matrix of feature values for a 10-second sub sound into 2-second intervals. In other examples, the matrix may be divided into intervals longer or shorter than 2 seconds, and the length of the predetermined time interval may be adjusted according to accumulated accident detection results.
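
The division into sub-matrices can be sketched as follows. The sketch assumes that the 150 rows of the feature matrix are the time axis, so that five 2-second blocks of 30 rows each result; the description does not state which axis represents time, and the function name is hypothetical.

```python
def split_by_time(matrix, n_segments):
    """Split the rows (assumed time axis) into n_segments equal consecutive blocks."""
    n_rows = len(matrix)
    if n_rows % n_segments != 0:
        raise ValueError("row count must be divisible by n_segments")
    step = n_rows // n_segments
    return [matrix[i:i + step] for i in range(0, n_rows, step)]


# Placeholder 150 x 100 feature matrix for one 10-second sub sound.
feature_matrix = [[0.0] * 100 for _ in range(150)]
sub_matrices = split_by_time(feature_matrix, 5)  # corresponds to x1..x5
```

Changing `n_segments` corresponds to choosing a longer or shorter time interval than 2 seconds.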

FIG. 12 is a diagram illustrating an accident sound learning model according to an embodiment.

FIG. 12 shows an exemplary RNN-based accident sound learning model. The accident sound learning model is stored in the model storage unit 123.

In an accident sound learning model based on a recurrent neural network (RNN), the next hidden state is obtained by multiplying the input value by a weight and adding a bias, adding the previous hidden state multiplied by its own weight, and applying the hyperbolic tangent as the activation function. Because each hidden state depends on the previous one, the model can take temporal dependence into account.

The hidden state at an arbitrary time t can be expressed by Equation 1.
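
Equation 1 itself is not reproduced in this text. The standard recurrence that matches the verbal description above is shown here as a reconstruction; the symbols W_x (input weight), W_h (recurrent weight), and b (bias) are assumed names, not taken from the original figure.

```latex
h_t = \tanh\left( W_x x_t + W_h h_{t-1} + b \right)
```

Here x_t is the input at time t and h_{t-1} is the hidden state of the previous time step.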

Unlike DNNs and CNNs, which use ReLU as the activation function, the RNN uses the hyperbolic tangent function. The hyperbolic tangent function is given by Equation 2.
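
Equation 2 is not reproduced in this text; the standard definition of the hyperbolic tangent is:

```latex
\tanh(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}}
```

Its output lies strictly between -1 and 1, which keeps the hidden state bounded from one time step to the next.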

FIG. 12 illustrates an accident sound learning model in which this process is drawn out over all input values. The processor 130 may determine the accident detection result based on the final output value (y15) of the accident sound learning model shown in FIG. 12.
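
The recurrence can be unrolled over a sequence as in the following minimal sketch. It uses scalar weights for readability; the weight values, the function name, and the idea of reading the last state as the model output are illustrative assumptions, and a trained model would use weight matrices and an output layer.

```python
import math


def rnn_forward(inputs, w_x, w_h, b, h0=0.0):
    """Scalar RNN: h_t = tanh(w_x * x_t + w_h * h_{t-1} + b); returns every h_t."""
    h = h0
    states = []
    for x in inputs:
        h = math.tanh(w_x * x + w_h * h + b)
        states.append(h)
    return states


# The same two inputs in a different order give a different final state,
# which is the temporal dependence the recurrent layer captures.
final_a = rnn_forward([1.0, 0.0], w_x=1.0, w_h=1.0, b=0.0)[-1]
final_b = rnn_forward([0.0, 1.0], w_x=1.0, w_h=1.0, b=0.0)[-1]
```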

The operation of the accident analysis unit 100a of the accident detection device 100 according to the embodiment has been described above. The accident processing unit 100b determines and performs a response operation according to the detection result of the accident analysis unit 100a. This is described in more detail below.

FIG. 13 is a diagram illustrating the operation of the accident processing unit according to an embodiment.

The accident processing unit 100b may determine a response operation based on the accident detection result determined by the accident analysis unit 100a and the input sound collected by the sound collector 10. Here, the input sound is the sound that the accident analysis unit 100a received from the sound collector 10 and stored in the storage unit 120 and/or the database 122.

When the accident analysis unit 100a determines that an accident has occurred, the operator of the operating device 200 needs to check the situation at the actual accident location through CCTV. To inform the operator of the operating device 200 of the accident detection result, the processor 130 may be configured to generate a notification message containing accident information based on the accident detection result of the accident analysis unit 100a. The generated notification message may be transmitted to the operating device 200 through the communication unit 110.

Meanwhile, the accident processing unit 100b may estimate the location of the accident and control a CCTV installed near the accident location so that the accident scene is automatically shown to the operator of the operating device 200.

That is, referring to FIG. 13, the accident processing unit 100b may estimate the accident direction and the accident distance, estimate the accident location based on the estimated direction and distance, search for a CCTV near the accident location, and control that CCTV. This series of operations of the accident processing unit 100b is carried out by the processor 130, which as a result generates a control command for controlling the CCTV near the accident location.

1) Estimation of accident direction

In general, sound emitted by a source reaches the two channels of a stereo microphone at slightly different times because of the baseline spacing between the microphones. The direction of the sound source can be calculated by measuring this time difference of arrival (TDA).

The processor 130 may calculate the TDA of the input sound collected by the stereo microphone of the sound collector 10 (that is, a stereo sound) through a cross-correlation operation.

To this end, the processor 130 transforms the left and right sound signals included in the input sound from the time domain into the frequency domain (for example, by performing an FFT) and calculates the cross-correlation of the two sets of frequency components. The processor 130 then transforms the cross-correlation result back into the time domain and takes the time at which it reaches its maximum as the TDA between the left and right sound signals of the input sound.
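
The TDA estimate can be illustrated with a short sketch. To stay dependency-free it computes the cross-correlation directly in the time domain, a technique swapped in here for the FFT-based route described above; both look for the lag at which the correlation peaks. The function name, the equal-length assumption, and the sample data are hypothetical.

```python
def estimate_tda(left, right, sample_rate):
    """Delay in seconds of `right` relative to `left` at the cross-correlation peak.

    Assumes both channels have the same length. A positive result means the
    sound reached the left channel first.
    """
    n = len(left)
    best_lag, best_score = 0, float("-inf")
    for lag in range(-(n - 1), n):
        score = 0.0
        for i in range(n):
            j = i + lag
            if 0 <= j < n:
                score += left[i] * right[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag / sample_rate


# Hypothetical impulse that arrives 3 samples later on the right channel.
left = [0, 0, 1, 0, 0, 0, 0, 0]
right = [0, 0, 0, 0, 0, 1, 0, 0]
tda = estimate_tda(left, right, sample_rate=1000)
```

The direct loop is quadratic in the signal length; for full-length recordings the FFT-based computation described in the text is the practical choice.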

For example, when the TDA is 0, the sound originated in the direction perpendicular to the microphone, and when the TDA corresponds to the full baseline spacing, the sound originated at +90 degrees or -90 degrees. This relationship is expressed by Equation 3.

Here, K is determined by the baseline and the speed of sound.

That is, the processor 130 may determine the direction perpendicular to the microphone as the accident direction when the TDA is 0, and may determine +90 degrees or -90 degrees as the accident direction when the TDA corresponds to the full baseline spacing.
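
Equation 3 is not reproduced in this text. One relation that satisfies both boundary cases stated above is the far-field formula angle = arcsin(K * TDA) with K = (speed of sound) / (baseline spacing). The sketch below is written under that assumption; the function name, the 343 m/s default, and the 0.2 m baseline are illustrative values only.

```python
import math


def direction_deg(tda_s, baseline_m, speed_of_sound=343.0):
    """Angle from the perpendicular direction, in degrees, assuming a far-field source."""
    ratio = speed_of_sound * tda_s / baseline_m   # K * TDA with K = c / baseline
    ratio = max(-1.0, min(1.0, ratio))            # guard against rounding past +/-1
    return math.degrees(math.asin(ratio))


angle_front = direction_deg(0.0, baseline_m=0.2)          # TDA of zero
angle_side = direction_deg(0.2 / 343.0, baseline_m=0.2)   # TDA spans the full baseline
```

With these inputs `angle_front` is 0 degrees and `angle_side` is approximately 90 degrees, matching the two cases in the text.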

2) Estimation of accident distance

For example, when an accident sound is collected by two sound collectors 10, the magnitude (S) of the maximum value of the accident sound signal is inversely proportional to the square of the distance from the sound source.

Accordingly, the position of the sound source between two sound collectors 10 can be determined from the spacing of the sound collectors 10 and the ratio of the magnitudes of the collected accident sound. When the maximum value of the accident sound signal measured by the n-th sound collector 10 is Sn, the maximum value of the accident sound signal measured by the (n+1)-th sound collector 10 is Sn+1, and the distance between the two collectors is L, the processor 130 may estimate the location of the accident along the direction from Sn toward Sn+1 by Equation 5.
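
Equation 5 is not reproduced in this text. The sketch below derives one possible form from the inverse-square relation stated above, under the added assumption that the accident lies on the line between the two collectors. With distances d_n and d_(n+1) to the two collectors, Sn / Sn+1 = (d_(n+1) / d_n)^2 and d_n + d_(n+1) = L, which gives d_n = L / (1 + sqrt(Sn / Sn+1)). The function name and the sample numbers are hypothetical.

```python
import math


def distance_from_collector_n(s_n, s_n1, spacing_m):
    """Distance from the n-th collector to the source, measured toward collector n+1.

    Assumes inverse-square amplitude decay and a source located between the
    two collectors, which are spacing_m apart.
    """
    if s_n <= 0 or s_n1 <= 0:
        raise ValueError("peak magnitudes must be positive")
    return spacing_m / (1.0 + math.sqrt(s_n / s_n1))


# Collector n hears the sound four times as strongly as collector n+1,
# so the source is twice as close to collector n.
d_n = distance_from_collector_n(s_n=4.0, s_n1=1.0, spacing_m=90.0)
```

For this input `d_n` is 30 m, leaving 60 m to the other collector, which reproduces the 4:1 magnitude ratio.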

3) Estimation of accident location

The processor 130 may estimate the distance and direction from two or more accident sounds and estimate the accident location as a weighted average of the locations obtained for the individual accident sounds.
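
The weighted average can be sketched as below. The description does not say how the weights are chosen, so the per-estimate weight (for example, a confidence or signal magnitude) and the planar (x, y) coordinates are assumptions.

```python
def weighted_location(estimates):
    """Weighted mean of (x, y, weight) location estimates."""
    total = sum(w for _, _, w in estimates)
    if total <= 0:
        raise ValueError("total weight must be positive")
    x = sum(px * w for px, _, w in estimates) / total
    y = sum(py * w for _, py, w in estimates) / total
    return x, y


# Two hypothetical estimates; the second is trusted three times as much.
location = weighted_location([(0.0, 0.0, 1.0), (10.0, 0.0, 3.0)])
```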

4) CCTV location search

Once the accident location is determined, the accident processing unit 100b selects a CCTV suitable for viewing the accident location. In general, several CCTVs are installed along a road. Therefore, the accident processing unit 100b stores the installation location and operating range of each CCTV as data in advance, searches for CCTVs whose operating range contains the accident location, and selects a CCTV that can show the accident location.
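
The search can be sketched as a lookup over pre-registered camera data. The record layout (a position along the road plus a symmetric operating range) and the tie-breaking rule of picking the nearest covering camera are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Cctv:
    cctv_id: str
    position_m: float   # installation position along the road (hypothetical unit)
    range_m: float      # distance the camera can cover in either direction


def select_cctv(cameras: Sequence[Cctv], accident_m: float) -> Optional[Cctv]:
    """Return the nearest camera whose operating range contains the accident location."""
    candidates = [c for c in cameras
                  if abs(c.position_m - accident_m) <= c.range_m]
    if not candidates:
        return None
    return min(candidates, key=lambda c: abs(c.position_m - accident_m))


cameras = [Cctv("A", 0.0, 300.0), Cctv("B", 500.0, 300.0), Cctv("C", 1000.0, 300.0)]
chosen = select_cctv(cameras, accident_m=420.0)
```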

5) CCTV control

Once the CCTV to be controlled has been selected, the processor 130 generates a control command for controlling the selected CCTV. For example, the control command may include a command that reads the CCTV setting table entry corresponding to the accident location and moves the CCTV to that position. As another example, the control command may include a command that rotates the field of view of the selected CCTV by a rotation angle calculated so that the accident location comes to the center of the field of view.
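
The rotation-angle calculation can be sketched as follows, assuming planar coordinates and a camera heading measured counterclockwise from the x-axis. These conventions and the function name are illustrative; the actual CCTV control protocol is not described in the text.

```python
import math


def pan_rotation_deg(cam_xy, cam_heading_deg, accident_xy):
    """Signed pan angle (degrees) that centers the accident location in view.

    The result is wrapped into [-180, 180) so the camera takes the shorter turn.
    """
    dx = accident_xy[0] - cam_xy[0]
    dy = accident_xy[1] - cam_xy[1]
    bearing = math.degrees(math.atan2(dy, dx))
    return (bearing - cam_heading_deg + 180.0) % 360.0 - 180.0


# Camera at the origin facing 350 degrees; the accident lies along the x-axis.
rotation = pan_rotation_deg((0.0, 0.0), 350.0, (10.0, 0.0))
```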

FIG. 14 is a diagram illustrating the data flow of an accident detection system according to an embodiment.

As described above, the accident detection system includes the sound collector 10 and the accident detection device 100.

The sound collector 10 is installed on the road and collects sounds generated in its surroundings. The sound collector 10 may include a plurality of sound collectors.

Referring to FIG. 2, the accident detection device 100 includes the accident analysis unit 100a and the accident processing unit 100b. The accident detection device 100 performs accident analysis and processing through the interaction of several processes executed by the processor 130 described above with reference to FIG. 3. FIG. 14 shows the processes running in the accident detection device 100 and the data flow among them. This is described in more detail below.

The sound collector 10 transmits the collected input sound to the accident detection device 100.

Under the control of the processor 130, the accident detection device 100 receives the input sound from the sound collector 10 through the communication unit 110.

In step 1400, the received input sound is passed from the CADS process to the MsgMan process under the control of the processor 130. For example, the raw data of the input sound is passed from the CADS process to the MsgMan process.

The MsgMan process forwards the input sound data to each process that needs it for data processing (for example, the DBSAVE process, the StreamGW process, and the ADSS process).

In step 1401, the DBSAVE process, under the control of the processor 130, divides the input sound into predetermined time units and stores it in the storage unit 120 as raw-data sound files. Also in step 1401, the DBSAVE process divides the input sound into predetermined time units, indexes it, and stores it in the database 122. The predetermined time unit may be, for example, 1 minute, but is not limited thereto.
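
The division into fixed time units can be sketched as below. The sketch works on an in-memory list of samples; the function name and the sample rate are hypothetical, and writing the files and database index entries is left out.

```python
def split_into_units(samples, sample_rate, unit_seconds=60):
    """Split a sample sequence into consecutive units of unit_seconds each.

    The last unit may be shorter when the stream length is not a multiple
    of the unit length.
    """
    unit = sample_rate * unit_seconds
    return [samples[i:i + unit] for i in range(0, len(samples), unit)]


# 150 seconds of placeholder audio at a deliberately tiny sample rate.
stream = list(range(1500))
units = split_into_units(stream, sample_rate=10, unit_seconds=60)
```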

In step 1406, the StreamGW process, under the control of the processor 130, may transmit the collected input sound in real time to an external server (WEB) through the communication unit 110. The external server (WEB) may transmit the received input sound to the operating device 200 in real time.

In step 1402, the ADSS process, under the control of the processor 130, analyzes the received input sound to detect an accident, and may transmit the detection result to the GWIF process described below so that it is stored in the database 122.

The GWIF process, under the control of the processor 130, handles all interfaces for control information and transactions concerning the sound collector 10 and/or the accident detection device 100 that are input from the external server. Here, the external server includes a server that provides operation and control services to the operating device 200.

In step 1405, the GWIF process, under the control of the processor 130, transmits the accident detection result and an accident event notification message to the external server (WEB) through the communication unit 110.

In step 1403, the GWIF process, under the control of the processor 130, receives, via the external server (WEB) and the communication unit 110, an accident event confirmation message and/or an accident event false-alarm message entered by the operator on the operating device 200, and stores the received message in the database 122. Step 1403 corresponds to a response to step 1405.

In step 1404, the GWIF process, under the control of the processor 130, may store control-related setting information, state information, and control history information of the sound collector 10 in the database 122.

In summary, when the input sound received from the sound collector 10 is passed to the processes running in the accident detection device 100, those processes store the received input sound in the storage unit 120, stream it in real time to the operating device 200 as needed, and perform accident detection to determine whether the input sound includes an accident sound. In addition, input sound in which an accident has been detected is stored in the database 122, and a management server may extract the stored input sound from the database 122 and transmit it to the operating device 200 for the operator.

FIG. 15 is a table showing exemplary accident detection results of the accident detection system according to an embodiment.

FIG. 15 shows the prediction results of the accident detection method performed by the accident detection device 100 according to the embodiment when a total of 69 accident (skid or collision) sounds, each 30 seconds long, were fed to the sound collector 10 as input. Of the 69 accident sounds, 63 were detected as accident sounds, for a detection rate of 91.30%. Six sounds went undetected, while the false detection rate was 0.00%.
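
The reported detection rate follows directly from the counts given for FIG. 15, as this short check shows. The function name is illustrative.

```python
def detection_rate(detected, total):
    """Detection rate in percent."""
    return 100.0 * detected / total


rate_text = f"{detection_rate(63, 69):.2f}%"   # 63 of 69 accident sounds detected
missed = 69 - 63
```

Here `rate_text` is "91.30%" and `missed` is 6, in agreement with the figures stated above.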

FIG. 16 is a table showing exemplary false detection results of the accident detection system according to an embodiment.

FIG. 16 shows the prediction results of the accident detection device 100 performing the accident detection method according to the embodiment when sound data collected over 24 hours on a specific date (July 20, 2019) was used as input. Skid sounds occurred twice; the first skid (n1) was detected correctly, but a false detection occurred for the second skid (n2).

The embodiments of the present invention described above may be implemented in the form of a computer program executable through various components on a computer, and such a computer program may be recorded on a computer-readable medium. The medium may include magnetic media such as hard disks, floppy disks, and magnetic tape; optical recording media such as CD-ROMs and DVDs; magneto-optical media such as floptical disks; and hardware devices specially configured to store and execute program instructions, such as ROM, RAM, and flash memory.

Meanwhile, the computer program may be specially designed and configured for the present invention, or may be known and available to those skilled in the field of computer software. Examples of the computer program include not only machine code such as that produced by a compiler, but also high-level language code that can be executed by a computer using an interpreter or the like.

In the specification of the present invention (particularly in the claims), the term "the" and similar referential terms may refer to both the singular and the plural. In addition, where a range is described in the present invention, the description includes inventions to which the individual values within that range are applied (unless stated otherwise), and is equivalent to setting out each individual value constituting the range in the detailed description of the invention.

Unless an order is explicitly stated for the steps constituting the method according to the present invention, or there is a statement to the contrary, the steps may be performed in any suitable order. The present invention is not necessarily limited to the order in which the steps are described. The use of any examples or exemplary terms (for example, "etc.") in the present invention is simply to describe the present invention in detail, and the scope of the present invention is not limited by those examples or exemplary terms unless limited by the claims. In addition, those skilled in the art will appreciate that various modifications, combinations, and changes may be made according to design conditions and factors within the scope of the appended claims or their equivalents.

Accordingly, the spirit of the present invention should not be limited to the embodiments described above, and not only the claims set out below but also all scopes equivalent to, or equivalently changed from, those claims fall within the scope of the spirit of the present invention.

Although specific embodiments of the present invention have been described and illustrated above, the present invention is not limited to the described embodiments, and those of ordinary skill in the art will understand that various modifications and variations can be made to other specific embodiments without departing from the spirit and scope of the present invention. Accordingly, the scope of the present invention should be determined not by the described embodiments but by the technical idea set forth in the claims.

10: sound collector
100: accident detection device
200: operating device

Claims

A device for detecting an accident occurring at a remote location,
a memory buffer for storing a stream of input sound sensed through a sound collector disposed in the vicinity of the remote location; and
at least one processor;
The processor is
obtain a series of sub sounds from the stream of input sounds,
generate a series of input data for executing an accident sound learning model based on the series of sub sounds,
configured to determine a detection result as to whether the input sound includes an accident sound according to an execution result of the accident sound learning model based on the series of input data,
The processor is configured, in order to generate input data corresponding to a current sub sound, to
convert the current sub sound among the series of sub sounds into a set of frequency component values for a predetermined frequency band, determine a first set of classification values according to a difference between a first array of the set of frequency component values and a second array of frequency component values of a base sound, and generate the input data corresponding to the current sub sound based on a second set of classification values, wherein the base sound includes at least one sub sound prior to the current sub sound among the series of sub sounds,
The processor is configured to determine a true value (TRUE) or a false value (FALSE) as each classification value of the first set,
The processor is configured to determine the detection result based on a ratio of accident sounds in a series of execution results of the accident sound learning model,
accident detection device.

The device of claim 1,
The processor is
further configured to divide the stream of input sound according to a predetermined length of time to obtain the series of sub sounds,
accident detection device.

The device of claim 1,
the series of sub sounds includes a first sub sound and a second sub sound subsequent to the first sub sound,
The processor is
further configured to obtain, from the stream of input sound, the second sub-sound overlapping at least a portion of the first sub-sound,
accident detection device

The device of claim 1,
the series of sub sounds includes a first sub sound and a second sub sound subsequent to the first sub sound,
The processor is
further configured to obtain the second sub sound from a point where the first sub sound ends in the stream of the input sound,
accident detection device

delete

The device of claim 1,
The accident sound learning model is implemented using a deep learning algorithm based on a deep neural network (DNN),
accident detection device.

The device of claim 1,
The processor is
Further configured to generate a command for controlling a camera installed in the vicinity of the remote location based on the detection result,
accident detection device.

The device of claim 1,
The processor is
Further configured to generate a notification message for accident information based on the detection result,
accident detection device.

A method for detecting an accident occurring at a remote location, the method comprising:
acquiring a stream of the sensed input sound through a sound collector disposed in the vicinity of the remote location;
obtaining a series of sub sounds from the stream of input sounds;
generating a series of input data for executing an accident sound learning model based on the series of sub sounds; and
determining a detection result as to whether the input sound includes an accident sound according to an execution result of the accident sound learning model based on the series of input data;
The step of generating the series of input data includes:
converting a current sub sound among the series of sub sounds into a set of frequency component values for a predetermined frequency band; and
generating input data corresponding to the current sub sound based on the set of frequency component values;
The step of generating input data corresponding to the current sub sound includes:
determining a first set of classification values according to a difference between a first arrangement of the set of frequency component values and a second arrangement of frequency component values of a base sound;
determining a second set of classification values by normalizing changes in values of respective frequency components of the first array and the second array; and
generating the input data based on the first set of classification values and the second set of classification values;
the base sound includes at least one sub sound prior to the current sub sound among the series of sub sounds;
Determining the first set of classification values comprises:
Determining a true value (TRUE) or a false value (FALSE) as each classification value of the first set according to whether a difference between each corresponding frequency component value of the first array and the second array is a predetermined constant multiple or more including,
The step of determining the detection result is,
Comprising the step of determining the detection result based on the ratio of the accident sound in a series of execution results of the accident sound learning model,
accident detection method.

The method of claim 12,
The remote location is located inside a tunnel,
accident detection method.

delete

The method of claim 12,
generating a command for controlling a camera installed in the vicinity of the remote location based on the detection result
further comprising,
accident detection method.

The method of claim 12,
generating a notification message for accident information based on the detection result
further comprising,
accident detection method.