CN117727331B - Forest gunshot positioning method based on voice analysis - Google Patents

Forest gunshot positioning method based on voice analysis

Info

Publication number
CN117727331B
CN117727331B (application CN202410179253.3A)
Authority
CN
China
Prior art keywords
energy
data points
column
taking
sound
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202410179253.3A
Other languages
Chinese (zh)
Other versions
CN117727331A (en)
Inventor
滕兵
张根
王崇瑞
杨毅
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Bainiao Data Technology Beijing Co ltd
Original Assignee
Bainiao Data Technology Beijing Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Bainiao Data Technology Beijing Co ltd
Priority to CN202410179253.3A
Publication of CN117727331A
Application granted
Publication of CN117727331B
Legal status: Active
Anticipated expiration

Classifications

    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02ATECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A40/00Adaptation technologies in agriculture, forestry, livestock or agroalimentary production
    • Y02A40/10Adaptation technologies in agriculture, forestry, livestock or agroalimentary production in agriculture
    • Y02A40/28Adaptation technologies in agriculture, forestry, livestock or agroalimentary production in agriculture specially adapted for farming

Landscapes

  • Measurement Of Mechanical Vibrations Or Ultrasonic Waves (AREA)

Abstract

The application relates to the technical field of voice processing and provides a forest gunshot positioning method based on voice analysis, which comprises the following steps: acquiring sound signal data in a forest and converting the sound signal data of each sound signal acquisition point into a spectrogram; constructing the local sound energy density according to the local energy characteristics of each data point in the spectrogram, and constructing the sound high-energy concentration degree according to the local sound energy density; obtaining the high-energy frequency band concentrated coverage rate of each frame of the sound signal in the spectrogram according to the sound high-energy concentration degree corresponding to each data point in the spectrogram; obtaining the gun sound Mach wave suspected degree according to the high-energy frequency band concentrated coverage rate; obtaining the gun sound Mach wave data sequence according to the gun sound Mach wave suspected degree; obtaining the direction angle of the gunshot in the forest with a multiple signal classification algorithm based on the gun sound Mach wave data sequence; and completing the positioning of the forest gunshot according to the direction angle of the gunshot. By obtaining the direction angle of the gunshot through the Mach wave data sequence, the application improves the accuracy of forest gunshot positioning.

Description

Forest gunshot positioning method based on voice analysis
Technical Field
The application relates to the technical field of voice processing, in particular to a forest gunshot positioning method based on voice analysis.
Background
Wind blowing through leaves, grass and other vegetation in a forest generates noise, and when a gunshot occurs, animals in the forest usually vocalize as well. As a result, the sound signal data collected by the microphone array contains considerable noise, animal sounds and other interference, and when a sound source localization algorithm such as MUSIC (Multiple Signal Classification) is used to localize the gunshot signal, animal sounds are easily treated as the sound source to be localized, which introduces large errors into the gunshot positioning result. Moreover, a forest contains a large number of trees and other vegetation; the gunshot signal is a high-frequency signal that is easily absorbed by the air, trees and vegetation during propagation, and animal sounds mix with the gunshot signal, so the gunshot signal acquired by the microphone array loses too much usable information, which affects the accuracy of the gunshot positioning result.
Disclosure of Invention
The application provides a forest gunshot positioning method based on voice analysis, which aims to solve the problem of low accuracy of positioning the gunshot in the forest through voice analysis, and adopts the following technical scheme:
one embodiment of the application provides a forest gunshot positioning method based on voice analysis, which comprises the following steps:
Acquiring sound signal data in a forest, and converting the sound signal data of each array element microphone in a microphone array of each sound signal acquisition point in the forest into a spectrogram;
Constructing a sound high-energy concentration degree according to the local energy characteristics of each data point in the spectrogram corresponding to each array element microphone; constructing a high-energy frequency band and a low-energy frequency band of each column of data points in the spectrogram according to the sound high-energy concentration degree corresponding to each data point in the spectrogram corresponding to each array element microphone; calculating the concentrated coverage rate of the high-energy frequency band according to the high-energy frequency band and the low-energy frequency band of each column of data points in the spectrogram corresponding to each array element microphone; calculating the gun sound Mach wave suspected degree based on the high-energy frequency band concentrated coverage rate corresponding to each column of data points in the spectrogram corresponding to each array element microphone;
And acquiring a Mach wave data matrix corresponding to each sound signal acquisition point in the forest according to the gun sound Mach wave suspected degree of all columns in the spectrogram corresponding to each array element microphone, and acquiring a positioning result of the direction angle of the gun sound in the forest based on the Mach wave data matrix.
Preferably, the method for constructing the sound high-energy concentration degree according to the local energy characteristics of each data point in the spectrogram corresponding to each array element microphone comprises the following steps:
taking energy values of all data points in a spectrogram corresponding to each array element microphone as input, and acquiring local sound energy density of each data point in the spectrogram by adopting a density peak clustering algorithm;
Constructing a local window with a preset size by taking each data point in a spectrogram corresponding to each array element microphone as a center, and calculating the energy distribution similarity concentration degree according to the local sound energy density and energy value difference between different data points in the local window of each data point in the spectrogram;
and calculating the sound high-energy concentration degree of each data point according to the energy distribution similar concentration degree corresponding to the local window of each data point in the spectrogram corresponding to each array element microphone.
Preferably, the specific method for calculating the similarity concentration degree of the energy distribution according to the local sound energy density and the energy value difference between different data points in the local window of each data point in the spectrogram is as follows:
For a local window of each data point in a spectrogram corresponding to each array element microphone, taking the product of the local energy density and the energy value of each data point in the local window as a first aggregation coefficient, and taking the absolute value of the difference between the first aggregation coefficients corresponding to any two data points in the local window as a second aggregation coefficient;
Taking Euclidean distance between any two data points in the local window as a third aggregation coefficient, and taking the reciprocal of the sum of the product of the second aggregation coefficient corresponding to any two data points in the local window and the third aggregation coefficient and the preset parameter as the energy distribution similarity aggregation degree between any two data points in the local window.
Preferably, the specific method for calculating the sound high-energy concentration degree of each data point according to the energy distribution similarity concentration degree corresponding to the local window of each data point in the spectrogram corresponding to each array element microphone comprises the following steps:
for a local window of each data point in the spectrogram corresponding to each array element microphone, taking the average value of the accumulated result of the energy distribution similarity concentration degree between any two data points in the local window on the local window as a first high energy coefficient; taking the average value of the energy values of all the data points in the local window as a second high energy coefficient, and taking the product of the first high energy coefficient and the second high energy coefficient of the local window as the sound high energy concentration degree of each data point in the spectrogram.
Preferably, the method for constructing the high-energy frequency band and the low-energy frequency band of each column of data points in the spectrogram according to the high-energy concentration degree of sound of each data point in the spectrogram corresponding to each array element microphone comprises the following steps:
A threshold segmentation algorithm is adopted to obtain segmentation thresholds of the sound high-energy aggregation degree of all data points in a spectrogram corresponding to each array element microphone, the data points with the sound high-energy aggregation degree larger than or equal to the segmentation thresholds in the spectrogram are used as sound high-energy aggregation points, and the data points with the sound high-energy aggregation degree smaller than the segmentation thresholds in the spectrogram are used as sound low-energy aggregation points;
For each column of data points in the spectrogram corresponding to each array element microphone, setting the energy value of the sound high-energy aggregation point in each column of data points to be 1, setting the energy value of the sound low-energy aggregation point in each column of data points to be 0, taking a sequence consisting of data with updated energy values of each column of data points as a frequency band analysis sequence, adopting a connected domain analysis algorithm to acquire a connected domain of the frequency band analysis sequence, taking the connected domain with elements of 1 in the frequency band analysis sequence as a high-energy frequency band, and taking the connected domain with elements of 0 in the frequency band analysis sequence as a low-energy frequency band.
Preferably, the method for calculating the concentrated coverage rate of the high-energy frequency band according to the high-energy frequency band and the low-energy frequency band of each column of data points in the spectrogram corresponding to each array element microphone comprises the following steps:
For each column of data points in the spectrogram corresponding to each array element microphone, taking a mapping result of standard deviation of energy values of all data points in any one high-energy frequency band in each column of data points as a first concentration coefficient, and taking an accumulation result of the first concentration coefficient on all high-energy frequency bands of each column of data points as a second concentration coefficient; taking the mapping result of the standard deviation of the energy values of all the data points in any one low-energy frequency band in each column of data points as a third concentration coefficient, taking the accumulation result of the third concentration coefficient on all the low-energy frequency bands in each column of data points as a fourth concentration coefficient, and taking the product of the second concentration coefficient and the fourth concentration coefficient as the speech spectrum frequency band energy concentration degree of each column of data points;
taking the average value of the energy values of all the data points in any one high-energy frequency band in each column of data points as a first difference coefficient, taking the average value of the energy values of all the data points in any one low-energy frequency band in each column of data points as a second difference coefficient, taking the mapping result of the absolute value of the difference between the first difference coefficient and the second difference coefficient as a third difference coefficient, and taking the accumulated result of the third difference coefficient on each column of data points as the frequency band energy difference coefficient of each column of data points;
Calculating high energy distribution concentration according to the spectrum band energy concentration of each column of data points in the spectrogram corresponding to each array element microphone, the band energy difference coefficient and the band distribution characteristics of each column of data points; and acquiring the concentrated coverage rate of the high-energy frequency band according to the concentrated degree of the high-energy distribution of each column of data points in the spectrogram corresponding to each array element microphone and the frequency bandwidth of the high-energy frequency band.
Preferably, the specific method for calculating the high energy distribution concentration according to the spectrum band energy concentration of each column of data points, the band energy difference coefficient and the band distribution characteristic of each column of data points in the spectrogram corresponding to each array element microphone comprises the following steps:
For each column of data points in the spectrogram corresponding to each array element microphone, respectively taking the maximum value of the frequency difference values between all data points in each high-energy frequency band and each low-energy frequency band in each column of data points as the frequency bandwidth of each high-energy frequency band and each low-energy frequency band;
Taking the ratio of the spectrum band energy concentration of each column of data points to the band energy difference coefficient as a numerator, taking the sum of the band widths of all low-energy bands between two adjacent high-energy bands in each column of data points as a first distribution coefficient, taking the sum of the average value of the accumulation result of the first distribution coefficient on each column of data points and a preset parameter as a denominator, and taking the ratio of the numerator to the denominator as the high-energy distribution concentration of each column of data points.
Preferably, the method for obtaining the concentrated coverage rate of the high-energy frequency band according to the concentrated rate of the high-energy distribution and the frequency bandwidth of the high-energy frequency band of each column data point in the spectrogram corresponding to each array element microphone comprises the following steps:
And for each column of data points in the spectrogram corresponding to each array element microphone, calculating the average value of the frequency bandwidths of all the high-energy frequency bands in each column of data points, and taking the product of the average value and the high-energy distribution concentration of each column of data points as the high-energy frequency band concentration coverage rate of each column of data points.
Preferably, the method for calculating the gun sound mach wave suspected degree based on the high energy frequency band concentrated coverage rate corresponding to each column of data points in the spectrogram corresponding to each array element microphone comprises the following steps:
for each column of data points in a spectrogram corresponding to each array element microphone, taking the sum of the frequency bandwidths of all high-energy frequency bands in each column of data points as the frequency band coverage rate, acquiring the frequency band coverage rates of all columns in a short time interval taking each column of data points as the center column, taking the serial number of each column in the short time interval as the abscissa and the frequency band coverage rate of each column as the ordinate, and taking the slope of the straight line fitted to the data points determined by the abscissa and the ordinate as the short-time frequency band coverage rate change index of each column of data points;
Taking the square of the difference between the short-time frequency band coverage rate change index of each column and the short-time frequency band coverage rate change index of the center column in the short-time interval taking each column of data points as the center column as a first suspected coefficient, taking the product of the average value of the accumulated results of the first suspected coefficient in the short-time interval taking each column of data points as the center column and the mapping result of the short-time frequency band coverage rate change index of each column of data points as a second suspected coefficient, taking the sum of the second suspected coefficient and a preset parameter as a denominator, taking the high-energy frequency band concentrated coverage rate corresponding to each column of data points as a numerator, and taking the ratio of the numerator and the denominator as the gun sound Mach wave suspected degree of each column of data points.
Preferably, the method for obtaining the mach wave data matrix corresponding to each sound signal acquisition point in the forest according to the mach wave suspected degree of the gunshot in all columns in the spectrogram corresponding to each array element microphone, and obtaining the direction angle of the gunshot in the forest based on the mach wave data matrix comprises the following steps:
Taking gun sound Mach wave suspected degrees of all columns in a spectrogram corresponding to each array element microphone as input, acquiring segmentation results of all columns in the spectrogram by using a maximum inter-class variance algorithm, taking the segmentation results of all columns as input of a pitch synchronous superposition algorithm, acquiring a spliced data sequence of the spectrogram, taking each spliced data sequence as one element in a set, taking a set formed by all spliced data sequences as a Mach wave data sequence set, and taking a sequence formed by sequencing all elements in the Mach wave data sequence set according to a time ascending sequence as the Mach wave data sequence of each array element microphone;
And taking the Mach wave data sequence of each array element microphone as one row element in the matrix, taking the matrix formed by the Mach wave data sequences of all array element microphones in the microphone array as a gunshot Mach wave data matrix of the microphone array of each sound signal acquisition point, dividing the gunshot Mach wave data matrix into K equal parts by columns and taking the division results as the input of a MUSIC algorithm, acquiring the Mach wave direction angle sequence of the microphone array of each sound signal acquisition point, taking the Mach wave direction angle sequence as the input of an LSTM neural network model, and acquiring the positioning result of the direction angle of the gunshot in the forest of each sound signal acquisition point.
The beneficial effects of the application are as follows: by analysing the local distribution and aggregation characteristics of sound energy in the sound signal data, the sound high-energy concentration degree reflects the energy differences of the local signal characteristics in the sound signal data, which improves the accuracy of analysing the energy characteristics of the Mach wave in the gunshot signal; the high-energy and low-energy frequency bands in the spectrogram corresponding to the sound signal data are then constructed according to the sound high-energy concentration degree, and the high-energy frequency band concentrated coverage rate is constructed from the differences in the distribution characteristics of the high-energy and low-energy bands; the high-energy frequency band concentrated coverage rate reflects the signal characteristics of the Mach wave in the sound signal data, which improves the accuracy of identifying the Mach wave in the gunshot signal; finally, based on the gun sound Mach wave suspected degree, the direction angle of the gunshot in the forest is obtained using the MUSIC algorithm and an LSTM neural network model, which improves the accuracy of positioning the gunshot in the forest through sound analysis.
Drawings
In order to more clearly illustrate the embodiments of the application or the technical solutions of the prior art, the drawings which are used in the description of the embodiments or the prior art will be briefly described, it being obvious that the drawings in the description below are only some embodiments of the application, and that other drawings can be obtained according to these drawings without inventive faculty for a person skilled in the art.
Fig. 1 is a flow chart of a forest gunshot positioning method based on voice analysis according to an embodiment of the present application;
Fig. 2 is a schematic diagram of an implementation process for obtaining a sound localization result in a forest according to an embodiment of the present application.
Detailed Description
The following description of the embodiments of the present application will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present application, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.
Referring to fig. 1, a flowchart of a forest gunshot positioning method based on voice analysis according to an embodiment of the application is shown, and the method includes the following steps:
Step S001, acquiring sound signal data in a forest.
The forest region is divided into a number of monitoring areas (an empirical value of 100 areas is used). Each monitoring area contains one sound signal acquisition point, and each acquisition point is equipped with a camera and a microphone array; the microphone array is used to acquire the sound signals near the area in real time. The number of array-element microphones in each microphone array is denoted $M$, and each array-element microphone records for a fixed acquisition duration within each sound signal acquisition period. The sampling frequency and the quantization length of the sound signal take empirical values, and the number of array-element microphones $M$ and the acquisition duration are set by the implementer.
Setting a plane coordinate system by taking the center of a forest as an origin, and acquiring a geometric array information vector of each microphone array, wherein each element in the geometric array information vector respectively represents the projection coordinates and the direction angles of each array element microphone in the microphone array in the plane coordinate system.
Thus, the sound signal data of each sound collection point in the forest is obtained.
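As an illustration of the acquisition setup above, the following minimal sketch models one sound signal acquisition point and its geometric array information vector in Python; the class names and the example coordinates are hypothetical and not taken from the patent.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class ArrayElement:
    """One array-element microphone: projection coordinates in the forest plane
    coordinate system (origin at the forest centre) and its direction angle."""
    x: float
    y: float
    direction_angle_deg: float

@dataclass
class AcquisitionPoint:
    """One sound signal acquisition point: a camera plus a microphone array whose
    geometry is described by the geometric array information vector."""
    area_id: int
    elements: List[ArrayElement]   # geometric array information vector

# hypothetical example: one acquisition point with a small 4-element array
point = AcquisitionPoint(
    area_id=1,
    elements=[ArrayElement(0.0, 0.0, 0.0), ArrayElement(0.1, 0.0, 90.0),
              ArrayElement(0.0, 0.1, 180.0), ArrayElement(0.1, 0.1, 270.0)],
)
```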
Step S002, the sound signal data of each array element microphone of the microphone array of each sound signal acquisition point is converted into a spectrogram, the local sound energy density is constructed according to the local energy characteristics of each data point in the spectrogram, and the sound pitch energy concentration degree is constructed according to the local sound energy density.
The sound signal generated by a gunshot is referred to as the gunshot signal. A gunshot signal mainly consists of a muzzle wave and a Mach wave. The muzzle wave is formed by the gas jet expelled from the muzzle after the bullet leaves the barrel, and its duration generally ranges from hundreds of milliseconds to seconds; the Mach wave is the sound wave generated by intense friction with the air when the bullet flies at supersonic speed. Because the muzzle wave is generated inside the gun and diffuses from the barrel opening in all directions, it loses a large amount of energy immediately after emission and is further absorbed by trees, vegetation and the atmosphere during propagation; when a silencer is mounted on the muzzle or the shooter is far from the microphone array, the microphone array can hardly detect the muzzle wave in the gunshot signal. In contrast, a Mach wave is generated at every point on the flight trajectory of the bullet, i.e. throughout almost the entire flight of the bullet. Therefore, the Mach wave signal is separated from the sound signal data collected by the microphone array, and the direction of the gunshot is positioned based on the separated Mach wave.
Further, the sound signal data collected at each sound signal acquisition point is analysed per array-element microphone. Taking the $n$-th sound signal acquisition point in the forest as an example, the sound signal data collected by the $m$-th array-element microphone of its microphone array is denoted $x_{n,m}$.
Further, $x_{n,m}$ is converted into a spectrogram: with a frame length of 10 ms, a frame shift of 4 ms and a Hamming window as the window function, $x_{n,m}$ is framed and windowed, and the spectrum of each frame is calculated with the short-time Fourier transform; the specific implementation of the spectrogram transformation is a known technique and is not described in detail. In the spectrogram, the abscissa represents time (i.e. the frame index) and the ordinate represents frequency; the data point in row $i$ and column $j$ of the spectrogram is denoted $P(i,j)$, and its energy value $E(i,j)$ represents the energy of the sound signal data of the $j$-th frame of $x_{n,m}$ at frequency $f_i$.
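A minimal sketch of the spectrogram conversion described above, using scipy's short-time Fourier transform with a 10 ms frame length, a 4 ms frame shift and a Hamming window; the 48 kHz sampling rate and the function name are assumptions, since the patent leaves the sampling parameters to the implementer.

```python
import numpy as np
from scipy.signal import stft

def sound_to_spectrogram(x, fs):
    """Convert a 1-D sound signal into an energy spectrogram.

    Frame length 10 ms, frame shift 4 ms, Hamming window, as described above.
    Rows are frequency bins, columns are frames; values are energy (|STFT|^2).
    """
    frame_len = int(round(0.010 * fs))          # 10 ms frame length
    frame_shift = int(round(0.004 * fs))        # 4 ms frame shift
    f, t, Z = stft(x, fs=fs,
                   window="hamming",
                   nperseg=frame_len,
                   noverlap=frame_len - frame_shift,
                   boundary=None, padded=False)
    E = np.abs(Z) ** 2                          # energy value of each data point
    return f, t, E

if __name__ == "__main__":
    fs = 48_000                                  # assumed sampling rate
    rng = np.random.default_rng(0)
    x = rng.standard_normal(fs)                  # 1 s of synthetic signal
    f, t, E = sound_to_spectrogram(x, fs)
    print(E.shape)                               # (frequency bins, frames)
```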
Further, the gunshot signal usually has large energy, but each array element of the microphone array is disturbed by noise while collecting the sound signal data, so the energy values of the data points in the spectrogram corresponding to $x_{n,m}$ deviate to some extent from the actual energy values, and the distribution of the data points with high sound energy in the spectrogram deviates from the actual distribution. Therefore, the data points suspected to belong to high sound energy aggregation regions are screened out according to the distribution of the data points in the spectrogram, so as to reduce the interference of noise with the sound energy distribution characteristics of the gunshot signal in the spectrogram.
Specifically, the energy values of all data points in the spectrogram corresponding to $x_{n,m}$ are taken as input, the clustering result of all data points is obtained with a density peak clustering algorithm, and the mean of the energy values of all data points in each cluster is taken as the local sound energy density of every data point in that cluster; a local window of size $k\times k$ (with $k$ taking the empirical value 3) is then constructed centred on each data point in the spectrogram. The specific calculation process of the density peak clustering algorithm is a known technique and is not described in detail. According to the differences in local sound energy density and energy value between the data points in the local window of each data point in the spectrogram, the energy distribution similarity concentration is calculated as follows:

$$S_{a,b}=\frac{1}{\left|\rho_{a}E_{a}-\rho_{b}E_{b}\right|\cdot d_{a,b}+\mu}$$

where $S_{a,b}$ denotes the energy distribution similarity concentration between the $a$-th and the $b$-th data point in the local window of data point $P(i,j)$; $\rho_a$ and $\rho_b$ denote their local sound energy densities; $E_a$ and $E_b$ denote their energy values; $d_{a,b}$ denotes the Euclidean distance between them in the spectrogram; and $\mu$ is an adjustment parameter with empirical value 0.01.

The larger the local sound energy densities and energy values of the $a$-th and the $b$-th data point, the larger their first aggregation coefficients $\rho_a E_a$ and $\rho_b E_b$; the closer these first aggregation coefficients are to each other, the smaller the second aggregation coefficient $\left|\rho_a E_a-\rho_b E_b\right|$; and the smaller the Euclidean distance between the two data points, the smaller the third aggregation coefficient $d_{a,b}$. The larger the calculated energy distribution similarity concentration $S_{a,b}$, the more similar the sound energy aggregation characteristics of the two data points in the spectrogram.
Further, the sound high-energy concentration degree of each data point is calculated from the energy distribution similarity concentrations corresponding to its local window:

$$G(i,j)=\Big(\frac{1}{N(N-1)}\sum_{a=1}^{N}\sum_{b\neq a}S_{a,b}\Big)\cdot\Big(\frac{1}{N}\sum_{a=1}^{N}E_{a}\Big)$$

where $G(i,j)$ denotes the sound high-energy concentration degree of data point $P(i,j)$ in the spectrogram; $S_{a,b}$ denotes the energy distribution similarity concentration between the $a$-th and the $b$-th data point in the local window of $P(i,j)$; $E_a$ denotes the energy value of the $a$-th data point in the local window; and $N$ denotes the number of data points in the local window.

The more similar the sound energy aggregation characteristics of the data points in the local window of $P(i,j)$, the larger the first high-energy coefficient (the mean of the pairwise energy distribution similarity concentrations); the larger the energy values of the data points in the local window, the larger the second high-energy coefficient (the mean energy value of the window). The larger the calculated sound high-energy concentration degree $G(i,j)$, the more likely $P(i,j)$ lies in a region where sound energy is highly aggregated.
So far, the sound high-energy concentration degree corresponding to each data point in the spectrogram is obtained.
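A minimal sketch of the sound high-energy concentration degree of a single data point, combining the two formulas above; it assumes the local sound energy density map rho has already been produced by the density peak clustering step and is passed in as an array.

```python
import numpy as np

def high_energy_concentration(rho, E, i, j, k=3, mu=0.01):
    """Sound high-energy concentration degree G(i, j) of one spectrogram data point.

    rho : 2-D array of local sound energy densities (assumed precomputed, e.g.
          the mean energy of the cluster each point falls in after density peak clustering)
    E   : 2-D array of spectrogram energy values
    k   : side length of the local window (empirical value 3)
    """
    h = k // 2
    pts = [(r, c)
           for r in range(max(0, i - h), min(E.shape[0], i + h + 1))
           for c in range(max(0, j - h), min(E.shape[1], j + h + 1))]
    s_vals = []
    for a in range(len(pts)):
        for b in range(a + 1, len(pts)):
            pa, pb = pts[a], pts[b]
            agg_a = rho[pa] * E[pa]                        # first aggregation coefficient
            agg_b = rho[pb] * E[pb]
            second = abs(agg_a - agg_b)                    # second aggregation coefficient
            third = np.hypot(pa[0] - pb[0], pa[1] - pb[1]) # third: Euclidean distance
            s_vals.append(1.0 / (second * third + mu))     # similarity concentration
    first_coeff = float(np.mean(s_vals))                   # first high-energy coefficient
    second_coeff = float(np.mean([E[p] for p in pts]))     # second high-energy coefficient
    return first_coeff * second_coeff
```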
Step S003, the high-energy frequency band concentrated coverage rate of each frame of sound signal in the spectrogram is obtained according to the sound high-energy aggregation degree corresponding to each data point in the spectrogram, and the gun sound Mach wave suspected degree is obtained according to the high-energy frequency band concentrated coverage rate.
In a forest, the occurrence of a gunshot causes some animals to vocalize, so $x_{n,m}$ also contains part of the animals' sound signals; the non-Mach-wave components of $x_{n,m}$ therefore need to be smoothed to reduce the interference of animal sound signals. The Mach wave generated by the bullet is a typical pulse signal whose energy distribution is concentrated in time-frequency space, and its energy-concentrated region occupies a relatively large part of the time-frequency plane; at the same time, the gunshot signal generally has large energy, so the energy of each Mach wave frame in the spectrogram is concentrated in one high-energy frequency band. Animal sound signals, by contrast, are intermittent and time-frequency sparse: their energy-concentrated regions occupy a relatively small part of the whole time-frequency plane, and the energy of each frame of an animal sound in the spectrogram is distributed over several high-energy frequency bands. Consequently, the band coverage of each Mach wave frame in the spectrogram is generally larger than that of each animal sound frame.
Specifically, the sound high-energy concentration degrees of all data points in the spectrogram corresponding to $x_{n,m}$ are taken as input, and the maximum inter-class variance (Otsu) threshold segmentation algorithm is used to obtain a segmentation threshold; the data points whose sound high-energy concentration degree is greater than or equal to the segmentation threshold are taken as sound high-energy aggregation points, and the data points whose sound high-energy concentration degree is smaller than the segmentation threshold are taken as sound low-energy aggregation points.
Further, for each frame of sound signal in the spectrogram, namely, each column of data points in the spectrogram, setting the energy value of the sound high-energy aggregation point in each column of data points to be 1, setting the energy value of the sound low-energy aggregation point in each column of data points to be 0, and taking a sequence formed by data with updated energy values of each column of data points as a frequency band analysis sequence; the input is a frequency band analysis sequence, a connected domain analysis algorithm is adopted to obtain a connected domain of the frequency band analysis sequence, the connected domain with elements of 1 in the frequency band analysis sequence is used as a high-energy frequency band, the connected domain with elements of 0 in the frequency band analysis sequence is used as a low-energy frequency band, and the difference value between the maximum value and the minimum value of frequencies corresponding to all data points in each high-energy frequency band and each low-energy frequency band is used as the frequency bandwidth of the high-energy frequency band and the low-energy frequency band.
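A minimal sketch of this binarisation and band-extraction step: an Otsu (maximum inter-class variance) threshold over the sound high-energy concentration degrees, followed by a run-length split of one binarised spectrogram column into high-energy and low-energy bands with their frequency bandwidths; the function names are illustrative.

```python
import numpy as np

def otsu_threshold(values, bins=256):
    """Simple maximum inter-class variance (Otsu) threshold on a 1-D array."""
    hist, edges = np.histogram(values, bins=bins)
    centers = (edges[:-1] + edges[1:]) / 2
    w = hist / hist.sum()
    best_t, best_var = centers[0], -1.0
    for k in range(1, bins):
        w0, w1 = w[:k].sum(), w[k:].sum()
        if w0 == 0 or w1 == 0:
            continue
        mu0 = (w[:k] * centers[:k]).sum() / w0
        mu1 = (w[k:] * centers[k:]).sum() / w1
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, centers[k]
    return best_t

def column_energy_bands(col_binary, freqs):
    """Split one spectrogram column (already binarised: 1 = sound high-energy
    aggregation point, 0 = sound low-energy aggregation point) into connected runs.
    Returns two lists of (start_row, end_row, bandwidth) for high / low bands,
    where bandwidth is the frequency span of the run."""
    high, low = [], []
    start = 0
    for r in range(1, len(col_binary) + 1):
        if r == len(col_binary) or col_binary[r] != col_binary[start]:
            band = (start, r - 1, abs(freqs[r - 1] - freqs[start]))
            (high if col_binary[start] == 1 else low).append(band)
            start = r
    return high, low
```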
Further, taking the $j$-th column of data points in the spectrogram corresponding to $x_{n,m}$ (i.e. the $j$-th frame of the sound signal) as an example, the spectrogram band energy concentration and the band energy difference coefficient of the column are calculated from its high-energy and low-energy frequency bands:

$$C_j=\Big(\sum_{r=1}^{R_j}f\big(\sigma^{H}_{j,r}\big)\Big)\cdot\Big(\sum_{s=1}^{S_j}f\big(\sigma^{L}_{j,s}\big)\Big),\qquad D_j=\sum_{r=1}^{R_j}\sum_{s=1}^{S_j}g\big(\big|\bar{E}^{H}_{j,r}-\bar{E}^{L}_{j,s}\big|\big)$$

where $C_j$ and $D_j$ denote the spectrogram band energy concentration and the band energy difference coefficient of the $j$-th column; $\sigma^{H}_{j,r}$ and $\sigma^{L}_{j,s}$ denote the standard deviations of the energy values of all data points in the $r$-th high-energy band and the $s$-th low-energy band of the column; $\bar{E}^{H}_{j,r}$ and $\bar{E}^{L}_{j,s}$ denote the corresponding mean energy values; $R_j$ and $S_j$ denote the numbers of high-energy and low-energy bands of the column; $f(\cdot)$ is a monotonically decreasing mapping of the standard deviation (for example $e^{-x}$), so that $f(\sigma^{H}_{j,r})$ is the first concentration coefficient and $f(\sigma^{L}_{j,s})$ the third concentration coefficient; and $g(\cdot)$ is a monotonically increasing mapping of the mean difference (for example $1-e^{-x}$), so that $g(|\bar{E}^{H}_{j,r}-\bar{E}^{L}_{j,s}|)$ is the third difference coefficient.

The more uniformly the energy values are distributed within each high-energy and low-energy band of the $j$-th column, the smaller the standard deviations, the larger the first and third concentration coefficients, and hence the larger the spectrogram band energy concentration $C_j$; the larger the energy values of the data points in the high-energy bands and the larger the difference between the mean energy values of the high-energy and low-energy bands, the larger the third difference coefficients and hence the larger the band energy difference coefficient $D_j$ of the column.
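A minimal sketch of the two column statistics defined above, operating on the (start_row, end_row, bandwidth) band tuples produced by the band-extraction step; since the patent only specifies "a mapping result", the exponential mappings used here are assumptions.

```python
import numpy as np

def band_energy_stats(col_energy, high_bands, low_bands):
    """Spectrogram band energy concentration C_j and band energy difference
    coefficient D_j of one column.

    col_energy : 1-D array of energy values of the column
    high_bands / low_bands : lists of (start_row, end_row, bandwidth)
    The exponential mappings of the standard deviation and of the mean
    difference are assumptions, not specified by the patent.
    """
    def seg(b):                       # energy values inside one band
        s, e, _ = b
        return col_energy[s:e + 1]

    # concentration: product of summed mappings of within-band standard deviations
    c_high = sum(np.exp(-np.std(seg(b))) for b in high_bands)
    c_low = sum(np.exp(-np.std(seg(b))) for b in low_bands)
    C = c_high * c_low

    # difference coefficient: accumulated mapping of |mean(high band) - mean(low band)|
    D = sum(1.0 - np.exp(-abs(np.mean(seg(hb)) - np.mean(seg(lb))))
            for hb in high_bands for lb in low_bands)
    return C, D
```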
Further, the high-energy distribution concentration is calculated from the spectrogram band energy concentration, the band energy difference coefficient and the band distribution characteristics of each column:

$$F_j=\frac{C_j/D_j}{\dfrac{1}{R_j-1}\sum_{r=1}^{R_j-1}W_{j,r}+\mu}$$

where $F_j$ denotes the high-energy distribution concentration of the $j$-th column of the spectrogram; $C_j$ and $D_j$ denote its spectrogram band energy concentration and band energy difference coefficient, and $C_j/D_j$ is their ratio; $W_{j,r}$ is the first distribution coefficient, whose size equals the sum of the bandwidths of all low-energy bands lying between the $r$-th and the $(r+1)$-th high-energy band of the column; $R_j$ denotes the number of high-energy bands of the column; and $\mu$ is an adjustment parameter with empirical value 0.01.

The more concentrated the distribution of all high-energy bands of the $j$-th column, the smaller the low-energy gaps between adjacent high-energy bands and hence the smaller the denominator, i.e. the larger the calculated high-energy distribution concentration $F_j$, indicating that the data points with high sound energy values in the column are concentrated within a relatively large, contiguous frequency range.
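A minimal sketch of the high-energy distribution concentration of one column; the guard against a zero band energy difference coefficient is an added safeguard, not part of the patent formula.

```python
def high_energy_distribution_concentration(C, D, high_bands, low_bands, mu=0.01):
    """High-energy distribution concentration F_j of one column:
    (C_j / D_j) divided by (mean low-energy gap between adjacent high-energy
    bands + mu).  Bands are (start_row, end_row, bandwidth) tuples."""
    high = sorted(high_bands)
    gaps = []
    for (s1, e1, _), (s2, e2, _) in zip(high, high[1:]):
        # sum of bandwidths of all low-energy bands lying between two
        # adjacent high-energy bands (first distribution coefficient)
        gaps.append(sum(w for (s, e, w) in low_bands if s > e1 and e < s2))
    mean_gap = sum(gaps) / len(gaps) if gaps else 0.0
    return (C / D) / (mean_gap + mu) if D != 0 else 0.0
```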
Further, the high-energy frequency band concentrated coverage rate is obtained from the high-energy distribution concentration and the high-energy bands of each column of data points in the spectrogram:

$$V_j=F_j\cdot\frac{1}{R_j}\sum_{r=1}^{R_j}B^{H}_{j,r}$$

where $V_j$ denotes the high-energy frequency band concentrated coverage rate of the $j$-th column of the spectrogram; $F_j$ denotes its high-energy distribution concentration; $B^{H}_{j,r}$ denotes the frequency bandwidth of the $r$-th high-energy band of the column; and $R_j$ denotes the number of high-energy bands of the column.

The larger the frequency bandwidths of the high-energy bands of the $j$-th column and the larger its high-energy distribution concentration, the larger the calculated high-energy frequency band concentrated coverage rate $V_j$, i.e. the more concentrated the energy distribution of the column.
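A minimal sketch of the high-energy frequency band concentrated coverage rate of one column, i.e. the mean bandwidth of its high-energy bands multiplied by its high-energy distribution concentration.

```python
def concentrated_coverage(F, high_bands):
    """High-energy band concentrated coverage V_j of one column.

    F          : high-energy distribution concentration F_j of the column
    high_bands : list of (start_row, end_row, bandwidth) tuples
    """
    if not high_bands:
        return 0.0
    mean_width = sum(w for (_, _, w) in high_bands) / len(high_bands)
    return mean_width * F
```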
Furthermore, high-frequency sound signals are more prone to reflection and diffraction when they meet obstacles such as trees and vegetation during propagation, and they are more easily absorbed by the air and by obstacles, so their energy attenuates to a larger degree. The Mach wave in the gunshot signal is, as a whole, a high-frequency signal whose energy is generally concentrated in one frequency band; therefore the high-energy frequency band concentrated coverage rate of the Mach wave frames in the spectrogram shows a marked downward trend over time, and the degrees of decline between the high-energy frequency band concentrated coverage rates of adjacent frames are relatively close.
Further, taking the $j$-th column of data points in the spectrogram corresponding to $x_{n,m}$ as an example, a short-time interval $\Delta_j$ is set with the $j$-th column as the centre column, consisting of the $u$ columns on its left, the $v$ columns on its right and the column itself, where $u$ and $v$ both take the empirical value 5. For every column in $\Delta_j$, the sum of the frequency bandwidths of all its high-energy bands is taken as the band coverage of that column, and the band coverages of all columns in $\Delta_j$ form a band coverage sequence. With the serial number of each column as abscissa and its band coverage as ordinate, a fitting straight line of the band coverage sequence is obtained by the least squares method, and the slope of the fitting straight line is taken as the short-time band coverage change index $k_j$ of the $j$-th column, which expresses the changing trend of the high-energy band coverage of the columns within the short-time interval $\Delta_j$.
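A minimal sketch of the short-time band coverage change index: a least-squares straight line is fitted to the band coverages of the columns in the short-time interval and its slope is returned; the window parameters u = v = 5 follow the empirical values above.

```python
import numpy as np

def coverage_change_index(band_coverage, j, u=5, v=5):
    """Short-time band coverage change index k_j of column j: the slope of the
    least-squares line fitted to the band coverages of the columns in the
    short-time interval [j - u, j + v].

    band_coverage : 1-D array of per-column sums of high-energy band bandwidths
    (the interval is assumed to contain at least two columns)
    """
    lo, hi = max(0, j - u), min(len(band_coverage), j + v + 1)
    x = np.arange(lo, hi)                 # column serial numbers (abscissa)
    y = band_coverage[lo:hi]              # band coverages (ordinate)
    slope, _ = np.polyfit(x, y, deg=1)    # least-squares fitted line
    return slope
```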
Further, the gun sound Mach wave suspected degree of each column of data points in the spectrogram is calculated from the high-energy frequency band concentrated coverage rate and the short-time band coverage change index:

$$Q_j=\frac{V_j}{\Big(\dfrac{1}{u+v+1}\sum_{g\in\Delta_j}\big(k_g-k_j\big)^{2}\Big)\cdot\varphi\big(k_j\big)+\mu}$$

where $Q_j$ denotes the gun sound Mach wave suspected degree of the $j$-th column of the spectrogram; $V_j$ denotes its high-energy frequency band concentrated coverage rate; $k_g$ and $k_j$ denote the short-time band coverage change indices of the $g$-th column of the short-time interval $\Delta_j$ and of the centre column $j$, so that $(k_g-k_j)^2$ is the first suspected coefficient; $u$ and $v$ denote the numbers of columns on the left and right sides of the centre column in $\Delta_j$; $\varphi(\cdot)$ is a monotonically increasing mapping of the change index (for example $e^{k_j}$); and $\mu$ is an adjustment parameter with empirical value 0.01.

The more concentrated the energy distribution of the $j$-th column, the larger $V_j$; the more obvious the downward trend of its short-time band coverage change index, the smaller the mapping $\varphi(k_j)$; and the closer the change trends of the high-energy band coverage of the neighbouring columns in $\Delta_j$ are to that of the centre column, the smaller the first suspected coefficients and hence the smaller the second suspected coefficient. In all these cases the denominator becomes smaller, so the calculated gun sound Mach wave suspected degree $Q_j$ of the column is larger.
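A minimal sketch of the gun sound Mach wave suspected degree of one column, taking the per-column concentrated coverage rates V and change indices k as arrays; the exponential mapping exp(k_j) of the change index is an assumption, since the patent only specifies a mapping.

```python
import numpy as np

def mach_wave_suspicion(V, k, j, u=5, v=5, mu=0.01):
    """Gun sound Mach wave suspected degree Q_j of column j.

    V : 1-D array of high-energy band concentrated coverage rates per column
    k : 1-D array of short-time band coverage change indices per column
    """
    lo, hi = max(0, j - u), min(len(k), j + v + 1)
    first = (k[lo:hi] - k[j]) ** 2                 # first suspected coefficients
    second = first.mean() * np.exp(k[j])           # second suspected coefficient (assumed mapping)
    return V[j] / (second + mu)
```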
So far, the gun sound Mach wave suspected degree of each column of data points in the spectrogram is obtained.
And S004, acquiring a sound Mach wave data sequence according to the sound Mach wave suspected degree, acquiring a direction angle of sound in the forest by utilizing a multiple signal classification algorithm based on the sound Mach wave data sequence, and completing the positioning of the sound of the forest according to the direction angle of the sound.
The sequence formed by the gun sound Mach wave suspected degrees of all columns of data points in the spectrogram corresponding to $x_{n,m}$ is taken as the Mach wave suspected degree sequence; this sequence is used as input, and the maximum inter-class variance (Otsu) algorithm is used to obtain a segmentation threshold $\theta$. The specific calculation process of the maximum inter-class variance algorithm is a known technique and is not described in detail.
Further, each column of data points in the spectrogram corresponds to one gun sound Mach wave suspected degree. The frames whose suspected degree is greater than the segmentation threshold $\theta$ are screened out of the spectrogram and divided into sets according to the continuity of the frames; for example, if the gun sound Mach wave suspected degrees of the 2nd, 3rd and 4th frames are all greater than $\theta$, the sound signals of the 2nd, 3rd and 4th frames form one set. The sound signals of all frames in each set are used as the input of the PSOLA (Pitch Synchronous Overlap-Add) algorithm, which outputs a spliced data sequence for that set; the specific calculation process of the PSOLA algorithm is a known technique and is not described in detail. All the spliced data sequences form the Mach wave data sequence set, and the sequence obtained by arranging the data points of all spliced data sequences in ascending order of time is taken as the gun sound Mach wave data sequence $y_{n,m}$ of the $m$-th array-element microphone of the microphone array of the $n$-th sound signal acquisition point.
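A minimal sketch of the frame screening and grouping step above; the PSOLA splicing itself is omitted, so the sketch only returns the sets of consecutive frame indices whose gun sound Mach wave suspected degree exceeds the segmentation threshold.

```python
import numpy as np

def mach_wave_frame_sets(suspicion, threshold):
    """Group the frames whose gun sound Mach wave suspected degree exceeds the
    segmentation threshold into sets of consecutive frames, as described above."""
    frames = [j for j, q in enumerate(suspicion) if q > threshold]
    sets, current = [], []
    for j in frames:
        if current and j != current[-1] + 1:       # break in frame continuity
            sets.append(current)
            current = []
        current.append(j)
    if current:
        sets.append(current)
    return sets

# example: frames 2, 3, 4 and 9, 10 exceed the threshold -> two sets
print(mach_wave_frame_sets(np.array([0, 0, 5, 6, 7, 0, 0, 0, 0, 4, 4]), 3))
# [[2, 3, 4], [9, 10]]
```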
Further, the gun sound Mach wave data sequences $y_{n,1},y_{n,2},\dots,y_{n,M}$ of all $M$ array-element microphones of the microphone array of the $n$-th sound signal acquisition point are taken as the rows of a matrix (equivalently, each sequence is taken as a column vector and the matrix of column vectors is transposed), and the resulting matrix $Y_n$ is taken as the gun sound Mach wave data matrix of the microphone array of the $n$-th sound signal acquisition point, in which each row is the gun sound Mach wave data sequence of one array-element microphone of that acquisition point.
Further, while flying at supersonic speed the bullet generates a Mach wave at every point on its flight trajectory, i.e. at every moment. Because the duration of the Mach wave is very short and its propagation speed is high, all Mach waves generated by the bullet within a short time form approximately the same direction angle with the same microphone array in the plane coordinate system. Therefore, $Y_n$ is divided column-wise into $K$ equal parts, where $K$ takes the empirical value 20; if the number of columns of $Y_n$ is not divisible by $K$, zero-valued columns are appended on its right side until it is, and $Y_n$ is then divided into $K$ equal parts by columns. Each divided sub-matrix, together with the geometric array information vector of the microphone array of the $n$-th sound signal acquisition point, is used as the input of the MUSIC algorithm, in which the number of prior signals is set to 3 and the number of array elements to 8; the average of all direction angles obtained for each sub-matrix is taken as the direction angle between the Mach waves generated during the time covered by that sub-matrix and the microphone array of the $n$-th sound signal acquisition point. The specific calculation process of the MUSIC algorithm is a known technique and is not described in detail.
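A minimal sketch of a narrowband MUSIC direction-of-arrival estimate for one sub-matrix; for simplicity it assumes a half-wavelength-spaced uniform linear array rather than the patent's geometric array information vector, and it picks the largest pseudospectrum values instead of true spectral peaks.

```python
import numpy as np

def music_doa(X, num_sources=3, d_over_lambda=0.5, angles_deg=None):
    """Narrowband MUSIC direction-of-arrival estimate for a uniform linear array.

    X : array of shape (num_elements, num_snapshots), one sub-matrix of the
        gun sound Mach wave data matrix
    Returns the angles (degrees) with the num_sources largest pseudospectrum values.
    """
    if angles_deg is None:
        angles_deg = np.arange(-90.0, 90.0, 0.5)
    m = X.shape[0]
    R = X @ X.conj().T / X.shape[1]                     # spatial covariance matrix
    eigvals, eigvecs = np.linalg.eigh(R)                # eigenvalues in ascending order
    En = eigvecs[:, : m - num_sources]                  # noise subspace
    theta = np.deg2rad(angles_deg)
    # steering matrix for a half-wavelength-spaced uniform linear array
    A = np.exp(-2j * np.pi * d_over_lambda *
               np.outer(np.arange(m), np.sin(theta)))
    proj = np.linalg.norm(En.conj().T @ A, axis=0) ** 2
    pseudo = 1.0 / np.maximum(proj, 1e-12)              # MUSIC pseudospectrum
    peaks = np.argsort(pseudo)[-num_sources:]           # crude stand-in for peak picking
    return np.sort(angles_deg[peaks])
```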
Further, the MUSIC algorithm thus yields $K$ Mach wave direction angles in total. The sequence of these $K$ direction angles arranged in ascending order of time is taken as the Mach wave direction angle sequence of the microphone array of the $n$-th sound signal acquisition point. This sequence is used as the input of an LSTM (Long Short-Term Memory) neural network model, which outputs the direction angle of the bullet's flight trajectory; this output is taken as the estimated bullet flight trajectory direction angle $\beta_n$ of the microphone array of the $n$-th sound signal acquisition point. The optimization algorithm is stochastic gradient descent and the loss function is the mean square error; the training of the LSTM neural network model is a known technique and its specific process is not described in detail.
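A minimal sketch of an LSTM regressor that maps a Mach wave direction angle sequence to a single flight-trajectory direction angle, trained with stochastic gradient descent and a mean square error loss as described above; the hidden size, batch size and synthetic data are assumptions.

```python
import torch
import torch.nn as nn

class TrajectoryAngleLSTM(nn.Module):
    """Minimal LSTM regressor mapping a Mach wave direction angle sequence to
    the bullet flight-trajectory direction angle (architecture details are
    assumptions; the patent does not specify them)."""
    def __init__(self, hidden_size=32):
        super().__init__()
        self.lstm = nn.LSTM(input_size=1, hidden_size=hidden_size, batch_first=True)
        self.head = nn.Linear(hidden_size, 1)

    def forward(self, angles):                  # angles: (batch, K, 1)
        _, (h_n, _) = self.lstm(angles)
        return self.head(h_n[-1]).squeeze(-1)   # (batch,) trajectory angles

model = TrajectoryAngleLSTM()
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)   # stochastic gradient descent
loss_fn = nn.MSELoss()                                      # mean square error loss

# one illustrative training step on synthetic data (K = 20 direction angles)
angle_seq = torch.randn(8, 20, 1)          # batch of 8 direction-angle sequences
target = torch.randn(8)                    # ground-truth trajectory angles
loss = loss_fn(model(angle_seq), target)
optimizer.zero_grad()
loss.backward()
optimizer.step()
```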
Further, since the bullet flies away from the firing point, the direction opposite to the estimated bullet flight trajectory direction angle $\beta_n$ (i.e. $\beta_n+180^{\circ}$) is taken as the positioning result of the direction angle of the gunshot in the forest. The specific implementation process for obtaining the gunshot positioning result in the forest is shown in Fig. 2.
In this specification, each embodiment is described in a progressive manner, and identical and similar parts of each embodiment are all referred to each other, and each embodiment mainly describes differences from other embodiments. The above description is only of the preferred embodiments of the present application and is not intended to limit the application, but any modifications, equivalent substitutions, improvements, etc. within the principles of the present application should be included in the scope of the present application.

Claims (10)

1. The forest gunshot positioning method based on the voice analysis is characterized by comprising the following steps of:
Acquiring sound signal data in a forest, and converting the sound signal data of each array element microphone in a microphone array of each sound signal acquisition point in the forest into a spectrogram;
Constructing a sound high-energy concentration degree according to the local energy characteristics of each data point in the spectrogram corresponding to each array element microphone; constructing a high-energy frequency band and a low-energy frequency band of each column of data points in the spectrogram according to the sound high-energy concentration degree corresponding to each data point in the spectrogram corresponding to each array element microphone; calculating the concentrated coverage rate of the high-energy frequency band according to the high-energy frequency band and the low-energy frequency band of each column of data points in the spectrogram corresponding to each array element microphone; calculating the gun sound Mach wave suspected degree based on the high-energy frequency band concentrated coverage rate corresponding to each column of data points in the spectrogram corresponding to each array element microphone;
And acquiring a Mach wave data matrix corresponding to each sound signal acquisition point in the forest according to the gun sound Mach wave suspected degree of all columns in the spectrogram corresponding to each array element microphone, and acquiring a positioning result of the direction angle of the gun sound in the forest based on the Mach wave data matrix.
2. The method for locating forest gunshot based on voice analysis according to claim 1, wherein the method for constructing the sound high-energy concentration degree according to the local energy characteristics of each data point in the spectrogram corresponding to each array element microphone is as follows:
taking energy values of all data points in a spectrogram corresponding to each array element microphone as input, and acquiring local sound energy density of each data point in the spectrogram by adopting a density peak clustering algorithm;
Constructing a local window with a preset size by taking each data point in a spectrogram corresponding to each array element microphone as a center, and calculating the energy distribution similarity concentration degree according to the local sound energy density and energy value difference between different data points in the local window of each data point in the spectrogram;
and calculating the sound high-energy concentration degree of each data point according to the energy distribution similar concentration degree corresponding to the local window of each data point in the spectrogram corresponding to each array element microphone.
3. The method for locating forest gunsound based on voice analysis according to claim 2, wherein the specific method for calculating the similarity concentration degree of energy distribution according to the difference of local sound energy density and energy value between different data points in the local window of each data point in the spectrogram is as follows:
For a local window of each data point in a spectrogram corresponding to each array element microphone, taking the product of the local energy density and the energy value of each data point in the local window as a first aggregation coefficient, and taking the absolute value of the difference between the first aggregation coefficients corresponding to any two data points in the local window as a second aggregation coefficient;
Taking Euclidean distance between any two data points in the local window as a third aggregation coefficient, and taking the reciprocal of the sum of the product of the second aggregation coefficient corresponding to any two data points in the local window and the third aggregation coefficient and the preset parameter as the energy distribution similarity aggregation degree between any two data points in the local window.
4. The method for locating forest gunsound based on voice analysis according to claim 2, wherein the specific method for calculating the sound high-energy concentration degree of each data point according to the energy distribution similarity concentration degree corresponding to the local window of each data point in the spectrogram corresponding to each array element microphone is as follows:
for a local window of each data point in the spectrogram corresponding to each array element microphone, taking the average value of the accumulated result of the energy distribution similarity concentration degree between any two data points in the local window on the local window as a first high energy coefficient; taking the average value of the energy values of all the data points in the local window as a second high energy coefficient, and taking the product of the first high energy coefficient and the second high energy coefficient of the local window as the sound high energy concentration degree of each data point in the spectrogram.
5. The forest gunshot positioning method based on voice analysis according to claim 1, wherein the method for constructing the high-energy frequency bands and the low-energy frequency bands of each column of data points in the spectrogram according to the sound high-energy concentration degree of each data point in the spectrogram corresponding to each array element microphone is as follows:
A threshold segmentation algorithm is adopted to obtain a segmentation threshold for the sound high-energy concentration degree of all data points in the spectrogram corresponding to each array element microphone; the data points whose sound high-energy concentration degree is greater than or equal to the segmentation threshold are taken as sound high-energy aggregation points, and the data points whose sound high-energy concentration degree is less than the segmentation threshold are taken as sound low-energy aggregation points;
For each column of data points in the spectrogram corresponding to each array element microphone, setting the energy values of the sound high-energy aggregation points in the column to 1 and the energy values of the sound low-energy aggregation points in the column to 0, taking the sequence formed by the updated energy values of the column of data points as a frequency band analysis sequence, adopting a connected domain (connected component) analysis algorithm to acquire the connected domains of the frequency band analysis sequence, taking the connected domains whose elements are 1 as high-energy frequency bands, and taking the connected domains whose elements are 0 as low-energy frequency bands.
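A minimal sketch of this step is shown below, assuming Otsu's method as the threshold segmentation algorithm (the claim only requires a threshold segmentation algorithm) and representing the per-column connected domains as half-open index ranges; the function name and return structure are illustrative.

```python
import numpy as np
from skimage.filters import threshold_otsu

def high_low_energy_bands(G):
    """Split each spectrogram column into high/low-energy frequency bands.

    G : sound high-energy concentration degree per data point,
        shape (n_bins, n_frames).
    Returns one (high_runs, low_runs) pair per column, where each run is a
    (start_bin, end_bin) half-open interval, i.e. a 1-D connected domain of
    the binarised column.
    """
    thr = threshold_otsu(G)            # one segmentation threshold per spectrogram
    B = (G >= thr).astype(np.int8)     # 1 = high-energy aggregation point, 0 = low
    bands = []
    for col in B.T:                    # one column of data points (frame) at a time
        change = np.flatnonzero(np.diff(col)) + 1
        starts = np.r_[0, change]
        ends = np.r_[change, col.size]
        high = [(s, e) for s, e in zip(starts, ends) if col[s] == 1]
        low = [(s, e) for s, e in zip(starts, ends) if col[s] == 0]
        bands.append((high, low))
    return bands
```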
6. The forest gunshot positioning method based on voice analysis according to claim 1, wherein the method for calculating the high-energy frequency band concentrated coverage rate according to the high-energy frequency bands and the low-energy frequency bands of each column of data points in the spectrogram corresponding to each array element microphone is as follows:
For each column of data points in the spectrogram corresponding to each array element microphone, taking the mapping result of the standard deviation of the energy values of all data points in each high-energy frequency band of the column as a first concentration coefficient, and taking the accumulated result of the first concentration coefficient over all high-energy frequency bands of the column as a second concentration coefficient; taking the mapping result of the standard deviation of the energy values of all data points in each low-energy frequency band of the column as a third concentration coefficient, taking the accumulated result of the third concentration coefficient over all low-energy frequency bands of the column as a fourth concentration coefficient, and taking the product of the second concentration coefficient and the fourth concentration coefficient as the spectrogram band energy concentration degree of each column of data points;
taking the average value of the energy values of all data points in each high-energy frequency band of the column as a first difference coefficient, taking the average value of the energy values of all data points in each low-energy frequency band of the column as a second difference coefficient, taking the mapping result of the absolute value of the difference between the first difference coefficient and the second difference coefficient as a third difference coefficient, and taking the accumulated result of the third difference coefficient over each column of data points as the frequency band energy difference coefficient of each column of data points;
Calculating the high-energy distribution concentration degree according to the spectrogram band energy concentration degree, the frequency band energy difference coefficient and the frequency band distribution characteristics of each column of data points in the spectrogram corresponding to each array element microphone; and acquiring the high-energy frequency band concentrated coverage rate according to the high-energy distribution concentration degree of each column of data points in the spectrogram corresponding to each array element microphone and the frequency bandwidths of the high-energy frequency bands.
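Under one reading of claim 6 (the mapping function g(·) and the pairing of high- and low-energy bands in the difference coefficient are not fixed by the claim text), with H_k and L_m denoting the k-th high-energy and m-th low-energy bands of a column, σ(·) the standard deviation and Ē the mean energy of a band:

```latex
% g(\cdot): the (unspecified) mapping function; H_k / L_m: k-th high- and m-th
% low-energy bands of the column; \sigma: standard deviation; \bar{E}: mean energy.
C = \Big(\sum_{k} g\big(\sigma(E_{H_k})\big)\Big)\Big(\sum_{m} g\big(\sigma(E_{L_m})\big)\Big),
\qquad
D = \sum_{k,\,m} g\big(\lvert \bar{E}_{H_k} - \bar{E}_{L_m} \rvert\big).
```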
7. The forest gunshot positioning method based on voice analysis according to claim 6, wherein the specific method for calculating the high-energy distribution concentration degree according to the spectrogram band energy concentration degree, the frequency band energy difference coefficient and the frequency band distribution characteristics of each column of data points in the spectrogram corresponding to each array element microphone is as follows:
For each column of data points in the spectrogram corresponding to each array element microphone, taking the maximum frequency difference between the data points within each high-energy frequency band and within each low-energy frequency band of the column as the frequency bandwidth of that high-energy frequency band or low-energy frequency band, respectively;
taking the ratio of the spectrogram band energy concentration degree of each column of data points to its frequency band energy difference coefficient as the numerator; taking the sum of the frequency bandwidths of all low-energy frequency bands lying between each pair of adjacent high-energy frequency bands in the column as a first distribution coefficient; taking the sum of a preset parameter and the average value of the first distribution coefficient accumulated over the column as the denominator; and taking the ratio of the numerator to the denominator as the high-energy distribution concentration degree of each column of data points.
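With C and D as in the shorthand introduced for claim 6, one reading of claim 7 is:

```latex
% gap_k: sum of the bandwidths of the low-energy bands lying between the k-th
% and (k+1)-th high-energy bands of the column; K: number of such gaps;
% \varepsilon: the preset parameter.
Q = \frac{C / D}{\dfrac{1}{K}\sum_{k=1}^{K} \mathrm{gap}_k + \varepsilon}
\qquad \text{(high-energy distribution concentration degree)}.
```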
8. The forest gunshot positioning method based on voice analysis according to claim 6, wherein the method for obtaining the high-energy frequency band concentrated coverage rate according to the high-energy distribution concentration degree and the frequency bandwidths of the high-energy frequency bands of each column of data points in the spectrogram corresponding to each array element microphone is as follows:
For each column of data points in the spectrogram corresponding to each array element microphone, calculating the average value of the frequency bandwidths of all high-energy frequency bands in the column, and taking the product of this average value and the high-energy distribution concentration degree of the column as the high-energy frequency band concentrated coverage rate of each column of data points.
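In the same shorthand, claim 8 is simply the product below, with B_k the frequency bandwidth of the k-th high-energy band of the column and N_H the number of high-energy bands:

```latex
R = Q \cdot \frac{1}{N_H}\sum_{k=1}^{N_H} B_k
\qquad \text{(high-energy frequency band concentrated coverage rate)}.
```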
9. The forest gunshot positioning method based on voice analysis according to claim 1, wherein the method for calculating the gunshot Mach wave suspected degree based on the high-energy frequency band concentrated coverage rate of each column of data points in the spectrogram corresponding to each array element microphone is as follows:
for each column of data points in the spectrogram corresponding to each array element microphone, taking the sum of the frequency bandwidths of all high-energy frequency bands in the column as the frequency band coverage rate; acquiring the frequency band coverage rate of every column within a short time interval that takes each column of data points as its center column; taking the serial number of each column in the short time interval as the abscissa and the frequency band coverage rate of that column as the ordinate; and taking the slope of the straight line fitted to the data points determined by the abscissa and the ordinate as the short-time frequency band coverage rate change index of each column of data points;
Taking the square of the difference between the short-time frequency band coverage rate change index of each column in the short time interval centered on each column of data points and that of the center column as a first suspected coefficient; taking the product of the average value of the first suspected coefficient accumulated over the short time interval and the mapping result of the short-time frequency band coverage rate change index of each column of data points as a second suspected coefficient; taking the sum of the second suspected coefficient and a preset parameter as the denominator and the high-energy frequency band concentrated coverage rate of each column of data points as the numerator; and taking the ratio of the numerator to the denominator as the gunshot Mach wave suspected degree of each column of data points.
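A minimal per-column sketch of claim 9 follows. The half window length, the exponential mapping of the change index and the value of the preset parameter eps are assumptions (the claim only speaks of a short time interval, a mapping result and a preset parameter); the function and argument names are illustrative.

```python
import numpy as np

def machwave_suspected_degree(coverage, concentrated_coverage, half_win=5, eps=1e-6):
    """Gunshot Mach wave suspected degree per column (frame).

    coverage              : frequency band coverage rate per column
                            (sum of high-energy bandwidths), shape (n_frames,)
    concentrated_coverage : high-energy frequency band concentrated coverage
                            rate per column, shape (n_frames,)
    """
    n = coverage.size
    slope = np.empty(n)
    for c in range(n):                                   # short-time change index
        lo, hi = max(0, c - half_win), min(n, c + half_win + 1)
        slope[c] = np.polyfit(np.arange(lo, hi), coverage[lo:hi], 1)[0]

    suspected = np.empty(n)
    for c in range(n):
        lo, hi = max(0, c - half_win), min(n, c + half_win + 1)
        first = (slope[lo:hi] - slope[c]) ** 2           # first suspected coefficient
        second = first.mean() * np.exp(-abs(slope[c]))   # assumed mapping of the index
        suspected[c] = concentrated_coverage[c] / (second + eps)
    return suspected
```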
10. The forest gunshot positioning method based on voice analysis according to claim 1, wherein the method for acquiring the Mach wave data matrix corresponding to each sound signal acquisition point in the forest according to the gunshot Mach wave suspected degree of all columns in the spectrogram corresponding to each array element microphone, and for acquiring the direction angle of the gunshot in the forest based on the Mach wave data matrix, is as follows:
Taking the gunshot Mach wave suspected degrees of all columns in the spectrogram corresponding to each array element microphone as input, and acquiring the segmentation results of all columns in the spectrogram by using the maximum inter-class variance (Otsu) algorithm; taking the segmentation results of all columns as the input of a pitch synchronous overlap-add algorithm to acquire the spliced data sequences of the spectrogram; taking each spliced data sequence as one element of a set, taking the set formed by all spliced data sequences as the Mach wave data sequence set, and taking the sequence obtained by sorting all elements of the Mach wave data sequence set in ascending time order as the Mach wave data sequence of each array element microphone;
and taking the Mach wave data sequence of each array element microphone as one row of a matrix, taking the matrix formed by the Mach wave data sequences of all array element microphones in the microphone array as the gunshot Mach wave data matrix of the microphone array at each sound signal acquisition point, taking the K equal-division results of the gunshot Mach wave data matrix as the input of a multiple signal classification (MUSIC) algorithm to acquire the Mach wave direction angle sequence of the microphone array at each sound signal acquisition point, and taking the Mach wave direction angle sequence as the input of an LSTM neural network model to acquire the positioning result of the direction angle of the gunshot in the forest for each sound signal acquisition point.
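The DOA step of claim 10 relies on the MUSIC algorithm applied to the gunshot Mach wave data matrix (one row per array element microphone). The sketch below shows a plain narrowband MUSIC pseudo-spectrum search for a uniform linear array; the array geometry, the half-wavelength spacing and the single-source assumption are mine, and the K equal-division of the matrix, the PSOLA splicing and the LSTM refinement of the angle sequence described in the claim are omitted.

```python
import numpy as np

def music_doa(X, n_sources=1, d_over_lambda=0.5, angles=np.linspace(-90, 90, 361)):
    """Narrowband MUSIC direction-of-arrival estimate for a uniform linear array.

    X : Mach wave data matrix, shape (n_mics, n_snapshots), real or complex.
    Returns the angle (degrees) that maximises the MUSIC pseudo-spectrum,
    together with the pseudo-spectrum itself.
    """
    n_mics = X.shape[0]
    R = X @ X.conj().T / X.shape[1]                 # sample covariance matrix
    _, eigvecs = np.linalg.eigh(R)                  # eigenvalues in ascending order
    En = eigvecs[:, : n_mics - n_sources]           # noise subspace
    spectrum = np.empty(angles.size)
    for i, theta in enumerate(np.deg2rad(angles)):
        a = np.exp(-2j * np.pi * d_over_lambda * np.arange(n_mics) * np.sin(theta))
        spectrum[i] = 1.0 / max(np.linalg.norm(En.conj().T @ a) ** 2, 1e-12)
    return angles[np.argmax(spectrum)], spectrum
```

In the claimed method this estimate would be computed for each of the K equal segments of the matrix, and the resulting Mach wave direction angle sequence would then be passed to the LSTM model to produce the final direction angle.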
CN202410179253.3A 2024-02-18 2024-02-18 Forest gunshot positioning method based on voice analysis Active CN117727331B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410179253.3A CN117727331B (en) 2024-02-18 2024-02-18 Forest gunshot positioning method based on voice analysis

Publications (2)

Publication Number Publication Date
CN117727331A (en) 2024-03-19
CN117727331B (en) 2024-04-19

Family

ID=90211070

Country Status (1)

CN (1): CN117727331B (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant