TWI728632B

TWI728632B - Positioning method for specific sound source

Info

Publication number: TWI728632B
Application number: TW108148521A
Authority: TW
Inventors: 莊煒宜; 蔡曜隆
Original assignee: 財團法人工業技術研究院
Priority date: 2019-12-31
Filing date: 2019-12-31
Publication date: 2021-05-21
Also published as: CN113126027A; TW202127058A

Abstract

A positioning method for a specific sound source is provided. Acoustic signals are collected through one sensor at various locations on a preset path. An algorithm is performed on the acoustic signals to obtain multiple signal characteristics. The signal characteristics are used as an input of a deep learning model, and the deep learning model is used for signal identification to obtain multiple specific audio signals at each position. An autocorrelation function operation is performed on these specific audio signals obtained at the same position to obtain multiple autocorrelation coefficients. A representative value is selected from these autocorrelation coefficients as a representative coefficient corresponding to each position. A specific sound source position is found according to the representative coefficient of each position.

Description

Location method of specific sound source

本揭露是有關於一種定位方法，且特別是有關於一種特定音源的定位方法。 The present disclosure relates to a positioning method, and particularly relates to a positioning method of a specific sound source.

目前檢測特定音源的發生點的方法大都需由檢測人員利用手持式感測設備逐步進行定點感測，再透過檢測人員的經驗來辨識所感測到的訊號是否為特定音訊。一旦確認為偵測到特定音訊後，檢測人員再依據其經驗來判斷該處是否為發出特定音訊的所在處。然而，由使用者進行辨識容易受到主觀意識、外在環境等因素的影響，使得辨識準確率不佳。 Most of the current methods for detecting the occurrence point of a specific sound source require the inspector to use a handheld sensing device to gradually perform fixed-point sensing, and then use the experience of the inspector to identify whether the sensed signal is a specific audio signal. Once it is confirmed that the specific audio is detected, the inspector will then determine whether the location is the location where the specific audio is emitted based on their experience. However, the identification by the user is easily affected by subjective consciousness, external environment and other factors, which makes the identification accuracy rate poor.

本揭露提供一種特定音源的定位方法，可有效提升辨識準確率。 The present disclosure provides a method for locating a specific sound source, which can effectively improve the recognition accuracy.

本揭露的特定音源的定位方法，包括：透過感測器分別在預設路徑上的多個位置中的每一個收集聲波訊號；對聲波訊號執行演算法，以取得多個訊號特徵；以所述訊號特徵作為深度學習模型的輸入，利用深度學習模型進行訊號辨識，以取得每一個位置的多個特定音訊訊號；對在同一位置所獲得的這些特定音訊訊號進行自相關函數運算，以獲得多個自相關係數；在這些自相關係數中選擇代表值作為各位置對應的代表係數；以及根據各位置的代表係數來找出特定音源位置。 The method for locating a specific sound source disclosed in the present disclosure includes: collecting a sound wave signal at each of a plurality of positions on a preset path through a sensor; Execute algorithms to obtain multiple signal features; use the signal features as the input of the deep learning model, and use the deep learning model for signal identification to obtain multiple specific audio signals at each location; These specific audio signals are subjected to autocorrelation function operations to obtain multiple autocorrelation coefficients; among these autocorrelation coefficients, representative values are selected as representative coefficients corresponding to each position; and specific audio source positions are found according to the representative coefficients of each position.

在本揭露的一實施例中，上述對聲波訊號執行演算法，以取得訊號特徵的步驟包括：對聲波訊號進行梅爾倒頻譜計算，將獲得的梅爾倒頻譜係數、一階微分梅爾倒頻譜係數以及二階微分梅爾倒頻譜係數作為訊號特徵。 In an embodiment of the present disclosure, the step of performing an algorithm on the acoustic signal to obtain signal characteristics includes: performing a Mel cepstrum calculation on the acoustic signal, and converting the obtained Mel cepstrum coefficients and first-order differential Mel inverts. Spectral coefficients and second-order differential Mel cepstral coefficients are used as signal characteristics.

在本揭露的一實施例中，上述根據各位置的代表係數來找出特定音源位置的步驟包括：在各位置的多個代表係數中找出最大值；以及將作為最大值的代表係數對應的位置判定為特定音源位置。 In an embodiment of the present disclosure, the step of finding the position of a specific sound source according to the representative coefficient of each position includes: finding the maximum value among the representative coefficients of each position; and corresponding the representative coefficient as the maximum value. The position is judged to be the position of the specific sound source.

在本揭露的一實施例中，上述根據各位置的代表係數來找出特定音源位置的步驟包括：計算相鄰兩個位置的兩個代表係數的差值，當差值大於閾值時，判定兩個代表係數中較大者所對應的位置為特定音源位置。 In an embodiment of the present disclosure, the step of finding a specific sound source position based on the representative coefficients of each position includes: calculating the difference between two representative coefficients of two adjacent positions, and when the difference is greater than a threshold, determining two The position corresponding to the larger of the representative coefficients is the specific sound source position.

在本揭露的一實施例中，上述深度學習模型為卷積神經網路。 In an embodiment of the present disclosure, the above-mentioned deep learning model is a convolutional neural network.

在本揭露的一實施例中，上述感測器基於取樣頻率以及取樣週期來取得多個取樣訊號，以所述取樣訊號作為該聲波訊號。 In an embodiment of the present disclosure, the sensor obtains a plurality of sampling signals based on the sampling frequency and the sampling period, and uses the sampling signals as the acoustic signal.

在本揭露的一實施例中，上述利用深度學習模型進行訊號辨識包括：利用深度學習模型判斷聲波訊號是否屬於特定音訊；以及在判定聲波訊號屬於特定音訊時，以取樣訊號作為特定音訊訊號。 In an embodiment of the present disclosure, using the deep learning model to perform signal recognition includes: using the deep learning model to determine whether the sound wave signal belongs to a specific audio signal; and when determining that the sound wave signal belongs to the specific audio signal, using the sampled signal as the specific audio signal.

在本揭露的一實施例中，上述代表值係為自相關係數中的最大值。 In an embodiment of the present disclosure, the above-mentioned representative value is the maximum value of the autocorrelation coefficient.

基於上述，透過深度學習模型即時辨識所量測之聲波訊號，並以訊號相關性分析方法進行訊號發生位置之定位。透過量測音訊能即時診斷特定事件，並定位該事件發生位置，以縮短事件處理時間及提升事件處理效率。 Based on the above, the measured acoustic signal is recognized in real time through the deep learning model, and the signal correlation analysis method is used to locate the location of the signal. Through the measurement of audio, a specific event can be diagnosed in real time and the location of the event can be located to shorten event processing time and improve event processing efficiency.

1~9:位置 1~9: location

100:定位裝置 100: positioning device

100A:主機 100A: host

110:處理器 110: processor

120:儲存裝置 120: storage device

130:感測器 130: Sensor

300A:示意圖 300A: schematic diagram

300B:曲線圖 300B: curve graph

310:預設路徑 310: Preset path

D:移動方向 D: Moving direction

S205~S230:特定音源的定位方法各步驟 S205~S230: The steps of the location method of a specific sound source

圖1A是依照本揭露一實施例的特定音源的定位裝置的方塊圖。 FIG. 1A is a block diagram of a device for locating a specific sound source according to an embodiment of the disclosure.

圖1B是依照本揭露另一實施例的特定音源的定位裝置的方塊圖。 FIG. 1B is a block diagram of a device for locating a specific sound source according to another embodiment of the disclosure.

圖2是依照本揭露一實施例的特定音源的定位方法流程圖。 FIG. 2 is a flowchart of a method for locating a specific sound source according to an embodiment of the disclosure.

圖3是依照本揭露一實施例的偵測特定音源的示意圖。 FIG. 3 is a schematic diagram of detecting a specific audio source according to an embodiment of the disclosure.

圖1A是依照本揭露一實施例的特定音源的定位裝置的方塊圖。圖1B是依照本揭露另一實施例的特定音源的定位裝置的方塊圖。 FIG. 1A is a diagram of a device for locating a specific sound source according to an embodiment of the present disclosure Block diagram. FIG. 1B is a block diagram of a device for locating a specific sound source according to another embodiment of the disclosure.

在圖1A中，定位裝置100包括處理器110、儲存裝置120以及感測器130。處理器110耦接至儲存裝置120以及感測器130。處理器110例如為中央處理單元(Central Processing Unit，CPU)、物理處理單元(Physics Processing Unit，PPU)、可程式化之微處理器(Microprocessor)、嵌入式控制晶片、數位訊號處理器(Digital Signal Processor，DSP)、特殊應用積體電路(Application Specific Integrated Circuits，ASIC)或其他類似裝置。 In FIG. 1A, the positioning device 100 includes a processor 110, a storage device 120 and a sensor 130. The processor 110 is coupled to the storage device 120 and the sensor 130. The processor 110 is, for example, a central processing unit (CPU), a physical processing unit (PPU), a programmable microprocessor (Microprocessor), an embedded control chip, and a digital signal processor (Digital Signal Processor). Processor, DSP), Application Specific Integrated Circuits (ASIC) or other similar devices.

儲存裝置120例如是任意型式的固定式或可移動式隨機存取記憶體(Random Access Memory，RAM)、唯讀記憶體(Read-Only Memory，ROM)、快閃記憶體(Flash memory)、硬碟或其他類似裝置或這些裝置的組合。儲存裝置120中儲存有一或多個程式碼片段，上述程式碼片段在被安裝後，會由處理器110來執行，以實現後述特定音源的定位方法。 The storage device 120 is, for example, any type of fixed or removable random access memory (Random Access Memory, RAM), read-only memory (Read-Only Memory, ROM), flash memory (Flash memory), hard disk Dish or other similar device or combination of these devices. One or more code snippets are stored in the storage device 120. After the code snippets are installed, they will be executed by the processor 110 to implement the specific sound source location method described later.

感測器130用以在預設路徑上的多個位置中的每一個收集聲波訊號。在一實施例中，僅需設置一個感測器130即可，但並不以此為限。在預設路徑上預先設定多個位置，定位裝置100自初始位置開始沿著預設路徑移動，每移動到所設定的位置，定位裝置100便會透過感測器130來收集該處的聲波訊號，接著處理器110便可透過聲波訊號來找出發出特定音訊的位置(底下稱為特定音源位置)。 The sensor 130 is used to collect the acoustic signal at each of a plurality of positions on the predetermined path. In one embodiment, only one sensor 130 needs to be provided, but it is not limited to this. A plurality of positions are preset on the preset path. The positioning device 100 starts to move along the preset path from the initial position. Whenever it moves to the set position, the positioning device 100 collects the acoustic signal therethrough through the sensor 130 Then, the processor 110 can use the sound wave signal to find out the location where the specific audio signal is emitted (hereinafter referred to as the specific audio source location).

在另一實施例中，如圖1B所示，定位裝置100包括主機100A以及獨立設置的感測器130。在本實施例中，處理器110以及儲存裝置120為設置在同一主機100A內，而感測器130則為獨立設置的構件。感測器130在收集聲波訊號之後，透過有線或無線的傳輸方式將聲波訊號傳送至主機100A。在此，儲存裝置120中儲存有一或多個程式碼片段，上述程式碼片段在被安裝後，會由處理器110來執行，以實現後述特定音源的定位方法。 In another embodiment, as shown in FIG. 1B, the positioning device 100 includes a host 100A and a sensor 130 provided independently. In this embodiment, the processor 110 and the storage device 120 are arranged in the same host 100A, and the sensor 130 is an independent component. After the sensor 130 collects the sound wave signal, it transmits the sound wave signal to the host 100A through a wired or wireless transmission method. Here, one or more code snippets are stored in the storage device 120. After the code snippets are installed, they will be executed by the processor 110 to implement the specific sound source location method described later.

圖2是依照本揭露一實施例的特定音源的定位方法流程圖。圖3是依照本揭露一實施例的偵測特定音源的示意圖。在圖3中，上半部的示意圖300A用以表示感測器130在預設路徑310上的多個位置1~位置9來收集聲波訊號，下半部的曲線圖300B表示基於對應於位置1~位置9的代表係數所獲得的曲線圖。 FIG. 2 is a flowchart of a method for locating a specific sound source according to an embodiment of the disclosure. FIG. 3 is a schematic diagram of detecting a specific audio source according to an embodiment of the disclosure. In FIG. 3, the upper half of the schematic diagram 300A is used to show that the sensor 130 collects acoustic signals at a plurality of positions 1 to 9 on the preset path 310, and the lower half of the graph 300B is based on the corresponding position 1 ~The graph obtained by the representative coefficient at position 9.

請參照圖2及圖3，在步驟S205中，透過感測器130分別在預設路徑310上的多個位置1~位置9中的每一個收集聲波訊號。即，感測器130在移動至位置1時，會停下來進行取樣以收集在位置1的聲波訊號。接著，感測器130移動至位置2，停下來進行取樣以收集在位置2的聲波訊號，以此類推來收集在位置1~位置9的聲波訊號。 2 and 3, in step S205, the acoustic signal is collected at each of a plurality of positions 1 to 9 on the predetermined path 310 through the sensor 130. That is, when the sensor 130 moves to the position 1, it will stop and sample to collect the acoustic signal at the position 1. Then, the sensor 130 moves to the position 2 and stops to sample to collect the acoustic signal at the position 2, and so on to collect the acoustic signal at the position 1 to the position 9.

接著，在步驟S210中，對聲波訊號執行演算法，以取得多個訊號特徵。處理器110對在每一個位置所收集的聲波訊號執行演算法。在一實施例中，處理器110對聲波訊號進行梅爾倒頻譜(Mel-Frequency Cepstrum，MFC)計算，將獲得的梅爾倒頻譜係數、一階微分梅爾倒頻譜係數以及二階微分梅爾倒頻譜係數作為聲波訊號的訊號特徵。 Next, in step S210, an algorithm is performed on the acoustic signal to obtain multiple signal characteristics. The processor 110 executes an algorithm on the acoustic signal collected at each location. In one embodiment, the processor 110 performs Mel-Frequency Cepstrum (MFC) calculation on the acoustic signal, and calculates the obtained Mel-Frequency Cepstrum (MFC) The coefficients, the first-order differential Mel cepstral coefficients, and the second-order differential Mel cepstral coefficients are used as the signal characteristics of the acoustic signal.

之後，在步驟S215中，處理器110以所述訊號特徵作為深度學習模型的輸入，利用深度學習模型進行訊號辨識，以取得各位置的多個特定音訊訊號。深度學習模型例如為卷積神經網路(Convolutional Neural Network，CNN)模型。將步驟S210所擷取的訊號特徵作為CNN模型的輸入，而此CNN模型的輸出結果即為辨識結果。 After that, in step S215, the processor 110 uses the signal feature as the input of the deep learning model, and uses the deep learning model to perform signal recognition to obtain a plurality of specific audio signals at each location. The deep learning model is, for example, a Convolutional Neural Network (CNN) model. The signal feature extracted in step S210 is used as the input of the CNN model, and the output result of the CNN model is the identification result.

感測器110會基於一取樣頻率以及一取樣週期來取得多個取樣訊號，並以取樣訊號作為聲波訊號，例如，感測器130的取樣頻率為8KHz，取樣週期為1秒，取樣數為185筆。說明聲波訊號包括了185筆取樣訊號，在進行步驟S210、S215後，判定出這185筆取樣訊號中，屬於特定音訊者，將這些取樣訊號作為特定音訊訊號，進而執行自相關函數運算(容後詳述)。 The sensor 110 obtains a plurality of sampling signals based on a sampling frequency and a sampling period, and uses the sampling signals as the acoustic signal. For example, the sampling frequency of the sensor 130 is 8KHz, the sampling period is 1 second, and the number of samples is 185 pen. It shows that the sound wave signal includes 185 sampled signals. After performing steps S210 and S215, it is determined that the 185 sampled signals belong to the specific audio signal, and these sampled signals are regarded as the specific audio signal, and then the autocorrelation function calculation is performed (to be included later). Details).

特定音訊訊號的種類例如為漏水音、交通音、電器音、漏氣音或者機械運轉音，但並不以此為限。在一實施例中，於訓練過程中，利用漏水音、交通音、電器音、漏氣音、機械運轉音等經梅爾倒頻譜計算後的訊號特徵來訓練出CNN模型。據此，在輸入訊號特徵至CNN模型之後，便能夠透過CNN模型來識別出此訊號特徵為漏水音、交通音、電器音、漏氣音或機械運轉音。 The type of the specific audio signal is, for example, water leakage sound, traffic sound, electric appliance sound, air leakage sound, or mechanical operation sound, but it is not limited to this. In one embodiment, in the training process, the CNN model is trained by using signal characteristics calculated by Mel cepstrum, such as water leakage sound, traffic sound, electrical sound, air leakage sound, mechanical running sound, and so on. According to this, after inputting the signal characteristics to the CNN model, the CNN model can be used to identify the signal characteristics as water leakage sound, traffic sound, electrical sound, air leakage sound or mechanical operating sound.

例如，分別輸入電器音、交通音及漏水音的聲波訊號的訊號特徵，透過CNN模型辨識，獲得表1所示的辨識結果。 For example, input the signal characteristics of the sonic signals of electrical appliance sound, traffic sound and water leakage sound respectively, and identify them through the CNN model, and obtain the identification results shown in Table 1.

在表1中，實際類別為交通音的聲波訊號有53筆，透過CNN模型所獲得的預測類別有53筆為交通音。實際類別為電器音的聲波訊號有66筆，透過CNN模型所獲得的預測類別有66筆為電器音。實際類別為漏水音的聲波訊號有66筆，透過CNN模型所獲得的預測類別有64筆為漏水音，有2筆被誤判為電器音。由此可知，在本實施例中，利用CNN模型辨識聲波訊號的訊號特徵，準確率可達98.9%。 In Table 1, there are 53 acoustic signals with the actual category of traffic sounds, and 53 of the predicted categories obtained through the CNN model are traffic sounds. There are 66 acoustic signals with the actual category of electrical sounds, and 66 of the predicted categories obtained through the CNN model are electrical sounds. There are 66 acoustic signals with the actual category of leaking sound, 64 of the predicted categories obtained through the CNN model are leaking sounds, and 2 of them are misjudged as electrical sounds. It can be seen that, in this embodiment, the accuracy of identifying the signal characteristics of the acoustic signal using the CNN model can reach 98.9%.

利用深度學習模型判斷聲波訊號是否屬於特定音訊，在判定聲波訊號屬於特定音訊時，以該取樣訊號作為特定音訊訊號。在一實施例中，以檢測漏水位置作為主要目的，此時則選用辨識為漏水音的聲波訊號作為特定音訊訊號。之後，在步驟S220中，處理器110對在一個位置中所獲得的多個特定音訊訊號進行自相關函數(autocorrelation function)運算，以獲得多個自相關係數。之後，在步驟S225中，處理器110在這些自相關係數中選擇一代表值作為每一個位置對應的代表係數。處理器110利用在同一個位置取樣的多個特定音訊訊號之自相關係數來量化週期訊號的強弱。具體而言，在一實施例中，代表值係為自相關係數中的一最大值，亦即在同一個位置的所取樣的多個特定音訊訊號的多個自相關係數中，取出最大值來作為該位置的代表係數，但並不以此為限。 The deep learning model is used to determine whether the sound wave signal belongs to a specific audio signal. When it is determined that the sound wave signal belongs to a specific audio signal, the sampled signal is used as the specific audio signal. In one embodiment, the main purpose is to detect the location of the water leakage. In this case, the sound wave signal identified as the water leakage sound is selected as the specific audio signal. After that, in step S220, the processor 110 performs an autocorrelation function operation on a plurality of specific audio signals obtained in one position to obtain a plurality of autocorrelation coefficients. After that, in step S225, the processor 110 selects a representative value among these autocorrelation coefficients as the representative coefficient corresponding to each position. The processor 110 uses the autocorrelation coefficients of multiple specific audio signals sampled at the same position to quantize the periodic signal. The strength of the number. Specifically, in one embodiment, the representative value is a maximum value among the autocorrelation coefficients, that is, the maximum value is taken out of the multiple autocorrelation coefficients of multiple specific audio signals sampled at the same position. As the representative coefficient of the position, but not limited to this.

在圖3的示例中，以在預設路徑300沿著移動方向D所設置的位置1~位置9發出的聲響屬於特定音訊來進行說明。首先，在感測器130沿著移動方向D移動至位置1時，在位置1上處理器110透過CNN模型可判定感測器130所收集的聲波訊號屬於特定音訊，接著便針對在位置1所取樣的多個特定音訊訊號進行自相關函數運算，藉此來獲得對應於位置1的多個自相關係數。之後，在對應於位置1的多個自相關係數中取出最大值來作為位置1對應的代表係數。接著，感測器130沿著移動方向D移動至位置2，處理器110透過CNN模型可判定感測器130所收集的聲波訊號屬於特定音訊，接著便針對位置2所取樣的多個特定音訊訊號進行自相關函數運算，藉此來獲得對應於位置2的多個自相關係數。之後，在對應於位置2的多個自相關係數中取出最大值來作為位置2對應的代表係數。以此類推，在逐步移動至位置3~位置9時，可計算出位置3~位置9對應的代表係數。 In the example of FIG. 3, it is described that the sound emitted from the position 1 to the position 9 arranged along the movement direction D of the preset path 300 belongs to the specific audio. First, when the sensor 130 moves to position 1 along the moving direction D, at position 1, the processor 110 can determine that the sound wave signal collected by the sensor 130 belongs to a specific audio signal through the CNN model, and then it is directed to the position 1 The sampled multiple specific audio signals are subjected to an autocorrelation function operation, thereby obtaining multiple autocorrelation coefficients corresponding to position 1. After that, the maximum value among the multiple autocorrelation coefficients corresponding to position 1 is taken as the representative coefficient corresponding to position 1. Then, the sensor 130 moves along the moving direction D to position 2, and the processor 110 can determine through the CNN model that the sound wave signal collected by the sensor 130 belongs to a specific audio signal, and then a plurality of specific audio signals sampled at position 2 Perform autocorrelation function calculations to obtain multiple autocorrelation coefficients corresponding to position 2. After that, the maximum value among the multiple autocorrelation coefficients corresponding to position 2 is taken as the representative coefficient corresponding to position 2. By analogy, when moving to position 3~position 9 step by step, the representative coefficient corresponding to position 3~position 9 can be calculated.

之後，在步驟S230中，處理器110根據各位置的代表係數來找出特定音源位置。具體而言，在所述位置的代表係數中找出最大值，將作為最大值的代表係數對應的位置判定為特定音源位置。以圖3而言，位置5被判定為特定音源位置。 After that, in step S230, the processor 110 finds the specific sound source position according to the representative coefficient of each position. Specifically, the maximum value is found among the representative coefficients of the position, and the position corresponding to the representative coefficient that is the maximum value is determined as the specific sound source position. In Fig. 3, position 5 is determined as a specific sound source position.

在另一實施例中，處理器110亦可計算相鄰兩個位置的兩個代表係數的差值，當差值大於一閾值時，判定兩個代表係數中較大者所對應的位置為特定音源位置。即，於某一範圍(例如圖3所示的位置1~位置9)內所量測之聲波訊號皆判定為特定音訊，使用相鄰兩位置的代表係數相互比較，直至兩個代表係數的差值大於閾值，以最大的代表係數所在位置即為特定音源位置。 In another embodiment, the processor 110 may also calculate the difference between two representative coefficients at two adjacent positions, and when the difference is greater than a threshold, determine that the position corresponding to the larger of the two representative coefficients is a specific The location of the sound source. That is, the sound wave signals measured in a certain range (such as position 1 to position 9 shown in Figure 3) are judged to be specific audio signals, and the representative coefficients of two adjacent positions are compared with each other until the difference between the two representative coefficients If the value is greater than the threshold, the position with the largest representative coefficient is the specific sound source position.

所述特定音源的定位方法可以應用在漏水音、交通音、電器音、漏氣音或者機械運轉音的定位，在此並不限制。例如，在漏水音的定位中，利用音訊特徵進行辨識，當特定事件(地下管路發生洩漏)，因管內物質(液體或氣體)的壓力變化而產生特定音訊。 The specific sound source location method can be applied to the location of water leakage sound, traffic sound, electrical appliance sound, air leakage sound or mechanical operating sound, and it is not limited here. For example, in the location of water leakage sound, audio characteristics are used for identification. When a specific event (a leak occurs in an underground pipeline), a specific audio is generated due to the pressure change of the substance (liquid or gas) in the pipe.

綜上所述，本揭露僅需使用單一個感測器在多個位置量測聲波訊號，搭配深度學習模型來即時辨識所量測之聲波訊號，並利用發出特定音訊的來源處的訊號週期性強的特性來計算相鄰位置的自相關係數，藉此找出特定音源位置。據此，僅需使用單一個感測器便能夠找出發出特定音訊的位置(特定音源位置)。並且，透過量測聲波訊號能夠即時來診斷是否發生特定事件，並定位發生特定事件的位置，以縮短事件處理時間及提升事件處理效率。 To sum up, the present disclosure only needs to use a single sensor to measure the acoustic signal at multiple locations, and use the deep learning model to recognize the measured acoustic signal in real time, and use the periodicity of the signal from the source of the specific audio signal. It has strong characteristics to calculate the autocorrelation coefficients of adjacent positions to find out the specific sound source position. According to this, only a single sensor is needed to find out the location where the specific audio is emitted (the location of the specific audio source). In addition, by measuring the sound wave signal, it is possible to diagnose whether a specific event has occurred in real time, and to locate the location of the specific event, so as to shorten the event processing time and improve the efficiency of event processing.

Claims

A method for locating a specific sound source includes: collecting a sound wave signal through a sensor at each of a plurality of positions on a predetermined path; executing an algorithm on the sound wave signal to obtain a plurality of signal characteristics; The signal features are used as the input of a deep learning model, and the deep learning model is used to perform signal recognition to obtain a plurality of specific audio signals at each of the positions; to perform a check on the specific audio signals obtained at the same position Operate the autocorrelation function to obtain a plurality of autocorrelation coefficients; select a representative value among the autocorrelation coefficients as a representative coefficient corresponding to each of the positions; and find out according to the representative coefficient of each of the positions A specific sound source location.

For example, in the method for locating a specific sound source described in item 1 of the scope of patent application, the step of executing the algorithm on the sound wave signal to obtain the signal characteristics includes: performing a Mel cepstrum calculation on the sound wave signal to obtain The Mel cepstral coefficients, the first-order differential Mel cepstral coefficients, and the second-order differential Mel cepstral coefficients are used as the signal characteristics.

According to the method for locating a specific sound source as described in item 1 of the scope of patent application, the step of finding the position of the specific sound source according to the representative coefficient of each of the positions includes: among the plurality of representative coefficients of the positions Find the maximum value; and determine the position corresponding to the representative coefficient as the maximum value as the specific sound source position Set.

For the method for locating a specific sound source as described in item 1 of the scope of patent application, the step of finding the position of the specific sound source according to the representative coefficient of each of the positions includes: calculating two adjacent positions of two said positions. When the difference is greater than a threshold, the position corresponding to the larger of the two representative coefficients is determined to be the specific sound source position.

In the method for locating a specific sound source as described in item 1 of the scope of patent application, the deep learning model is a convolutional neural network model.

In the method for locating a specific sound source as described in the first item of the patent application, the sensor obtains a plurality of sampling signals based on a sampling frequency and a sampling period, and the sampling signals are used as the sound wave signal.

For example, in the method for locating a specific sound source as described in item 6 of the scope of patent application, the step of using the deep learning model for signal identification includes: using the deep learning model to determine whether the sound wave signal belongs to a specific sound signal; and determining the sound wave signal When it belongs to the specific audio signal, the sampled signals are used as the specific audio signals.

In the method for locating a specific sound source as described in item 1 of the scope of the patent application, the representative value is a maximum value among the autocorrelation coefficients.