CN114863943B - Self-adaptive positioning method and device for environmental noise source based on beam forming - Google Patents


Publication number
CN114863943B
Authority
CN
China
Prior art keywords
noise
voiceprint
recording
target
environment
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210778085.0A
Other languages
Chinese (zh)
Other versions
CN114863943A (en)
Inventor
曹祖杨
周航
侯佩佩
张鑫
李佳罗
闫昱甫
洪全付
陶慧芳
方吉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Crysound Electronics Co Ltd
Original Assignee
Hangzhou Crysound Electronics Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Crysound Electronics Co Ltd filed Critical Hangzhou Crysound Electronics Co Ltd
Priority to CN202210778085.0A priority Critical patent/CN114863943B/en
Publication of CN114863943A publication Critical patent/CN114863943A/en
Application granted granted Critical
Publication of CN114863943B publication Critical patent/CN114863943B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208 Noise filtering
    • G10L21/0216 Noise filtering characterised by the method used for estimating noise
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01S RADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S1/00 Beacons or beacon systems transmitting signals having a characteristic or characteristics capable of being detected by non-directional receivers and defining directions, positions, or position lines fixed relatively to the beacon transmitters; Receivers co-operating therewith
    • G01S1/72 Beacons or beacon systems transmitting signals having a characteristic or characteristics capable of being detected by non-directional receivers and defining directions, positions, or position lines fixed relatively to the beacon transmitters; Receivers co-operating therewith using ultrasonic, sonic or infrasonic waves
    • G01S1/76 Systems for determining direction or position line
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01S RADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S5/00 Position-fixing by co-ordinating two or more direction or position line determinations; Position-fixing by co-ordinating two or more distance determinations
    • G01S5/18 Position-fixing by co-ordinating two or more direction or position line determinations; Position-fixing by co-ordinating two or more distance determinations using ultrasonic, sonic, or infrasonic waves
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2413 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on distances to training or reference patterns
    • G06F18/24147 Distances to closest patterns, e.g. nearest neighbour classification
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/03 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
    • G10L25/24 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters the extracted parameters being the cepstrum
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208 Noise filtering
    • G10L21/0216 Noise filtering characterised by the method used for estimating noise
    • G10L2021/02161 Number of inputs available containing the signal or the noise to be suppressed
    • G10L2021/02166 Microphone arrays; Beamforming

Abstract

The invention discloses a method and a device for beamforming-based adaptive positioning of environmental noise sources. The method comprises: acquiring an environmental recording collected by a microphone array in a target environment, and extracting voiceprint Mel-spectrum features of the environmental recording based on the Mel cepstrum; determining the noise source type corresponding to each voiceprint Mel-spectrum feature with a KNN algorithm, and selecting all noise features corresponding to that noise source type from a preset noise database; and performing beamforming positioning on the environmental recording based on each noise feature to obtain the target noise position. By extracting the voiceprint Mel-spectrum features of the environmental recording, determining the noise type with the KNN algorithm, reverse-screening the environmental recording with the noise features of that type, and then locating the noise source position by beamforming, this application enables the sound source localization system to locate various noise sources automatically and accurately as they occur, with high positioning accuracy.

Description

Self-adaptive positioning method and device for environmental noise source based on beam forming
Technical Field
The present invention relates to the field of sound source localization technologies, and in particular, to a method and an apparatus for adaptively locating an ambient noise source based on beamforming.
Background
With the development of urban construction, ever more facilities are built in urban areas and generate more noise, so the noise pollution caused by environmental noise must be monitored. At present, environmental noise monitoring faces difficulties in law enforcement and evidence collection: when the noise level of a monitored area exceeds the limit, several potential noise-producing units may exist near the monitoring point, and the noise source cannot be located from the recording alone, so effective supervision is impossible. For a noise source with a fixed frequency, the traditional beamforming method can locate the source to some extent. In environmental monitoring, however, noise sources are diverse and relatively broadband, so traditional beamforming cannot achieve automatic positioning. In summary, no method currently exists that can accurately locate an over-limit noise source in environmental monitoring.
Disclosure of Invention
In order to solve the above problem, embodiments of the present application provide a method and an apparatus for adaptive positioning of environmental noise sources based on beamforming.
In a first aspect, an embodiment of the present application provides a beamforming-based adaptive positioning method for environmental noise sources, where the method includes:
acquiring an environmental recording collected by a microphone array in a target environment, and extracting voiceprint Mel-spectrum features of the environmental recording based on the Mel cepstrum;
determining the noise source type corresponding to each voiceprint Mel-spectrum feature based on a KNN algorithm, and selecting all noise features corresponding to the noise source type from a preset noise database;
and performing beamforming positioning on the environmental recording based on each noise feature to obtain a target noise position.
Preferably, the extracting the voiceprint Mel-spectrum features of the environmental recording based on the Mel cepstrum includes:
calculating a fast Fourier transform spectrum corresponding to the environmental recording;
and passing the fast Fourier transform spectrum through a Mel-frequency filter to obtain the voiceprint Mel-spectrum features.
Preferably, the determining the noise source type corresponding to each voiceprint Mel-spectrum feature based on the KNN algorithm includes:
obtaining a classified-sample voiceprint feature plane corresponding to a preset noise database, and mapping each voiceprint Mel-spectrum feature onto the classified-sample voiceprint feature plane;
calculating the distance between each mapped voiceprint Mel-spectrum feature and the sample voiceprint features in the classified-sample voiceprint feature plane, and selecting a first preset number of first sample voiceprint features in order of increasing distance;
and counting the number of occurrences of each noise source type among the first sample voiceprint features, and determining the most frequent noise source type as the noise source type corresponding to the voiceprint Mel-spectrum feature.
Preferably, the performing beamforming positioning on the environmental recording based on each of the noise features to obtain a target noise position includes:
filtering the environmental recording based on each noise feature, and removing the sound features in the environmental recording that match none of the noise features, to obtain a noise recording;
and performing beamforming positioning on the noise recording to obtain the target noise position.
Preferably, the performing beamforming positioning on the noise recording to obtain a target noise position includes:
calculating, based on a cross-correlation method, a first relative reception delay between the microphones of the microphone array for the noise recording;
determining a target plane corresponding to the noise recording, and dividing the target plane into a second preset number of area positions;
simulating, for each area position, a second relative reception delay between the microphones;
and determining the target second relative reception delay with the smallest difference from the first relative reception delay, the target area position corresponding to the target second relative reception delay being the target noise position.
Preferably, the method further comprises:
acquiring an environmental image collected by a monitoring dome camera in the target environment, and generating sound-image evidence information of the noise source by combining the environmental image with the target noise position.
In a second aspect, an embodiment of the present application provides a beamforming-based adaptive positioning apparatus for environmental noise source, where the apparatus includes:
the acquisition module is used for acquiring an environmental recording collected by a microphone array in a target environment and extracting voiceprint Mel-spectrum features of the environmental recording based on the Mel cepstrum;
the selection module is used for determining the noise source type corresponding to each voiceprint Mel-spectrum feature based on a KNN algorithm and selecting all noise features corresponding to the noise source type from a preset noise database;
and the positioning module is used for performing beamforming positioning on the environmental recording based on each noise feature to obtain a target noise position.
In a third aspect, an embodiment of the present application provides an electronic device, which includes a memory, a processor, and a computer program stored on the memory and executable on the processor, where the processor executes the computer program to implement the steps of the method as provided in the first aspect or any one of the possible implementation manners of the first aspect.
In a fourth aspect, the present application provides a computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements the method as provided in the first aspect or any one of the possible implementations of the first aspect.
The beneficial effects of the invention are as follows: by extracting the voiceprint Mel-spectrum features of the environmental recording, determining the noise type according to the KNN algorithm, reverse-screening the environmental recording based on the noise features of that noise type, and then locating the noise source position by beamforming, the sound source localization system can automatically and accurately locate various noise sources as they are produced, with high positioning accuracy.
Drawings
In order to illustrate the technical solutions in the embodiments of the present application more clearly, the drawings required by the embodiments are briefly described below. Obviously, the drawings in the following description show only some embodiments of the present application, and those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1 is a schematic flowchart of a beamforming-based adaptive positioning method for environmental noise source according to an embodiment of the present disclosure;
fig. 2 is a schematic structural diagram of an adaptive positioning apparatus for environmental noise source based on beamforming according to an embodiment of the present application;
fig. 3 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application.
In the following description, the terms "first", "second", and the like are used for descriptive purposes only and are not to be construed as indicating or implying relative importance. The following description provides embodiments of the present application, and different embodiments may be substituted or combined; the present application is therefore intended to include all possible combinations of the same and/or different embodiments described. Thus, if one embodiment includes features A, B, and C and another embodiment includes features B and D, this application should also be considered to include embodiments containing every other possible combination of one or more of A, B, C, and D, even though such a combination may not be explicitly recited in the text below.
The following description provides examples, and does not limit the scope, applicability, or examples set forth in the claims. Changes may be made in the function and arrangement of elements described without departing from the scope of the disclosure. Various examples may omit, substitute, or add various procedures or components as appropriate. For example, the described methods may be performed in an order different than the order described, and various steps may be added, omitted, or combined. Furthermore, features described with respect to some examples may be combined into other examples.
Referring to fig. 1, fig. 1 is a schematic flowchart of a beamforming-based adaptive positioning method for environmental noise sources according to an embodiment of the present application. In an embodiment of the present application, the method includes:
s101, acquiring an environment recording acquired by a microphone array in a target environment, and extracting a voiceprint mei-spectrum feature of the environment recording based on a mel-frequency cepstrum.
The execution subject may be a cloud server of the sound source localization system.
In the embodiment of the application, the cloud server first acquires the environmental recording collected by the microphone array in the target environment in which noise is to be monitored; the microphone array may be a spherical microphone array. After the environmental recording is acquired, the cloud server extracts the voiceprint Mel-spectrum features from it by means of the Mel cepstrum, so that noise can be identified and located in the subsequent steps.
In one possible implementation, the extracting the voiceprint Mel-spectrum features of the environmental recording based on the Mel cepstrum includes:
calculating a fast Fourier transform spectrum corresponding to the environmental recording;
and passing the fast Fourier transform spectrum through a Mel-frequency filter to obtain the voiceprint Mel-spectrum features.
In the embodiment of the present application, to obtain the voiceprint Mel-spectrum features from the environmental recording, the fast Fourier transform spectrum, i.e., the FFT spectrum, of the recording must first be calculated:

X(k) = Σ_{n=0}^{N−1} x(n) · e^(−i2πkn/N), k = 0, 1, …, N−1

where x(n) is the finite-length discrete signal, namely the environmental recording, with n = 0, 1, …, N−1; e^(−i2πkn/N) = cos(2πkn/N) − i·sin(2πkn/N) by Euler's formula; and X(k) is the complex spectrum datum composed of the amplitude and phase of the periodic component at frequency k/N.
After the fast Fourier transform spectrum is calculated, it is passed through a Mel-frequency filter bank, which yields the voiceprint Mel-spectrum features. Since the Mel filter coefficients of the filter bank are fixed, the features obtained in this step can be expressed as:

S(m) = Σ_{k} |X(k)| · H_m(k)

where X(k) is the FFT spectrum data, H_m(k) are the Mel filter coefficients, and S is the voiceprint Mel-spectrum feature.
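The two steps above (computing the FFT spectrum, then applying the Mel filter bank) can be sketched in Python with NumPy. This is an illustrative sketch rather than the patent's implementation; the filter count, sample rate, frame length, and the standard triangular-filter construction are assumptions:

```python
import numpy as np

def hz_to_mel(f):
    # Standard Mel-scale conversion
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular Mel filters spanning 0 Hz to the Nyquist frequency."""
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_points) / sample_rate).astype(int)
    fbank = np.zeros((n_filters, n_fft // 2 + 1))
    for m in range(1, n_filters + 1):
        left, center, right = bins[m - 1], bins[m], bins[m + 1]
        for k in range(left, center):                  # rising edge
            fbank[m - 1, k] = (k - left) / max(center - left, 1)
        for k in range(center, right):                 # falling edge
            fbank[m - 1, k] = (right - k) / max(right - center, 1)
    return fbank

def mel_spectrum(frame, sample_rate=16000, n_filters=26):
    """FFT magnitude spectrum through the Mel filter bank: S(m) = sum_k |X(k)| H_m(k)."""
    spectrum = np.abs(np.fft.rfft(frame))              # |X(k)|, k = 0 .. N/2
    fbank = mel_filterbank(n_filters, len(frame), sample_rate)
    return fbank @ spectrum                            # one voiceprint feature vector

# Usage: a 1 kHz tone frame concentrates its energy in the filters around 1 kHz
sr = 16000
t = np.arange(1024) / sr
S = mel_spectrum(np.sin(2 * np.pi * 1000.0 * t), sr)
```

In a real deployment the recording would be windowed into short frames and a feature vector extracted per frame; the sketch processes a single frame for clarity.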
S102, determining the noise source type corresponding to each voiceprint Mel-spectrum feature based on a KNN algorithm, and selecting all noise features corresponding to the noise source type from a preset noise database.
In the embodiment of the application, the obtained voiceprint Mel-spectrum features are classified by a KNN algorithm, which determines the noise source type corresponding to each voiceprint Mel-spectrum feature. Once the noise source type is determined, the noise position is not determined directly from the environmental recording; instead, all noise features corresponding to the obtained noise source type are looked up in a preset noise database, and the environmental recording is reverse-screened according to those noise features. This avoids missing any noise features and ensures the accuracy of the final positioning.
In an embodiment, the determining, based on the KNN algorithm, the noise source type corresponding to each of the voiceprint Mel-spectrum features includes:
obtaining a classified-sample voiceprint feature plane corresponding to a preset noise database, and mapping each voiceprint Mel-spectrum feature onto the classified-sample voiceprint feature plane;
calculating the distance between each mapped voiceprint Mel-spectrum feature and the sample voiceprint features in the classified-sample voiceprint feature plane, and selecting a first preset number of first sample voiceprint features in order of increasing distance;
and counting the number of occurrences of each noise source type among the first sample voiceprint features, and determining the most frequent noise source type as the noise source type corresponding to the voiceprint Mel-spectrum feature.
In the embodiment of the present application, the specific calculation process of the KNN algorithm is as follows: map the computed voiceprint features onto the classified-sample voiceprint feature plane of the noise library, calculate on that plane the distance between the current voiceprint feature and the voiceprint features of the various samples, count the sample types among the first K samples with the smallest distances, and take the type with the largest count as the noise type of the signal. Illustratively, when K = 5, if the five noise library samples closest to the signal on the feature plane are [1 - industrial noise, 2 - industrial noise, 3 - human voice, 4 - vehicle noise, 5 - industrial noise], the signal is classified as industrial noise.
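The vote just described can be sketched as follows; the Euclidean distance metric, the label names, and the one-dimensional "feature plane" are illustrative assumptions:

```python
import numpy as np
from collections import Counter

def knn_classify(feature, sample_features, sample_labels, k=5):
    """Classify one voiceprint feature by majority vote over its k nearest
    reference samples (smallest Euclidean distance on the feature plane)."""
    dists = np.linalg.norm(sample_features - feature, axis=1)
    nearest = np.argsort(dists)[:k]                 # k smallest distances first
    votes = Counter(str(sample_labels[i]) for i in nearest)
    return votes.most_common(1)[0][0]               # most frequent type wins

# The worked example from the text: K = 5 and nearest labels
# [industrial, industrial, human voice, vehicle, industrial]
labels = np.array(["industrial", "industrial", "human voice", "vehicle", "industrial"])
refs = np.array([[1.0], [2.0], [3.0], [4.0], [5.0]])      # distances 1..5 from the query
label = knn_classify(np.array([0.0]), refs, labels, k=5)  # industrial wins 3 votes to 2
```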
S103, performing beamforming positioning on the environmental recording based on each noise feature to obtain a target noise position.
In the embodiment of the application, after the noise features of the noise source type present in the environmental recording are determined, the environmental recording is reverse-screened with those noise features, and beamforming positioning is performed after the screening is complete, so as to determine the position of the target noise, i.e., of the noise source producing the over-limit noise in the environmental recording.
In one possible embodiment, step S103 includes:
filtering the environmental recording based on each noise feature, and removing the sound features in the environmental recording that match none of the noise features, to obtain a noise recording;
and performing beamforming positioning on the noise recording to obtain the target noise position.
In the embodiment of the application, the environmental recording is filtered according to the obtained noise features so as to retain the sound features that match the noise features and remove the remaining, unmatched sound features, which yields the noise recording contained in the environmental recording. The target noise position is then obtained by beamforming the noise recording. Unlike the traditional approach, the features corresponding to the noise are not determined directly from the environmental recording; instead, once the noise source type has been determined from the environmental recording, the recording is screened with the noise features corresponding to that source type. All noise features in the environmental recording can therefore be recovered, avoiding the inaccurate positioning, or outright failure to position, that occurs in the traditional direct-feature approach when some features go unrecognized or are missed.
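The reverse screening can be sketched as a spectral mask. The patent does not fix the representation of the database's noise features, so treating them as frequency bands (`noise_bands`) is a hypothetical stand-in for illustration only:

```python
import numpy as np

def screen_recording(x, sample_rate, noise_bands):
    """Keep only the spectral content matching the noise-class feature bands,
    suppress everything else, and resynthesize the noise-only recording."""
    X = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(len(x), d=1.0 / sample_rate)
    mask = np.zeros(freqs.shape, dtype=bool)
    for lo, hi in noise_bands:
        mask |= (freqs >= lo) & (freqs <= hi)
    X[~mask] = 0.0                        # remove unmatched sound features
    return np.fft.irfft(X, n=len(x))      # the "noise recording"

# Usage: a 1 kHz "noise" plus a 5 kHz interferer; keep only the 1 kHz band
sr = 16000
t = np.arange(2048) / sr
x = np.sin(2 * np.pi * 1000.0 * t) + np.sin(2 * np.pi * 5000.0 * t)
y = screen_recording(x, sr, [(900.0, 1100.0)])
```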
In one embodiment, the performing beamforming positioning on the noise recording to obtain a target noise position includes:
calculating, based on a cross-correlation method, a first relative reception delay between the microphones of the microphone array for the noise recording;
determining a target plane corresponding to the noise recording, and dividing the target plane into a second preset number of area positions;
simulating, for each area position, a second relative reception delay between the microphones;
and determining the target second relative reception delay with the smallest difference from the first relative reception delay, the target area position corresponding to the target second relative reception delay being the target noise position.
In the embodiment of the present application, the specific beamforming procedure is as follows. Since the microphone array consists of multiple microphones, a sound feature emitted from a given position is received by each microphone at a slightly different time. First, the first relative reception delay between the microphones for the noise recording is calculated by the cross-correlation method:

R_xy(τ) = Σ_n x(n) · y(n + τ)

where x(n) and y(n) are the time series of the signals received by two microphones, τ is the displacement time, and R_xy(τ) is the cross-correlation series.
At the maximum of the cross-correlation series the two time series are aligned, so the index of the maximum multiplied by the displacement-time step gives the delay found by cross-correlation.
Furthermore, based on the orientation of the noise recording, a target plane can be selected and divided into regions (for example, a 10 m × 10 m visual plane in front of the array, divided into 1 m × 1 m regions). For each region, the second relative reception delay that the microphones would observe if the sound were emitted from that region is then calculated in turn, also on the basis of the cross-correlation method. Finally, the first relative reception delay is compared with each second relative reception delay; the smaller the difference, the closer the two positions are, so the target region corresponding to the target second relative reception delay with the smallest difference is determined to be the target noise position.
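The delay-matching grid search above can be sketched end to end. The microphone layout, sample rate, grid, and signal simulation are all illustrative assumptions; the simulated delays for each candidate region follow from the candidate-to-microphone distances and the speed of sound (c ≈ 343 m/s):

```python
import numpy as np

def tdoa(ref, sig, sample_rate):
    """Relative reception delay of `sig` versus `ref`, from the index of the
    maximum of the cross-correlation series."""
    corr = np.correlate(sig, ref, mode="full")
    lag = int(np.argmax(corr)) - (len(ref) - 1)     # displacement at the peak
    return lag / sample_rate                        # delay in seconds

def locate(signals, mic_positions, grid_points, sample_rate, c=343.0):
    """Grid search: return the candidate region whose simulated inter-microphone
    delays differ least from the measured first relative reception delays."""
    measured = np.array([tdoa(signals[0], s, sample_rate) for s in signals[1:]])
    best, best_err = None, np.inf
    for p in grid_points:
        d = np.linalg.norm(mic_positions - p, axis=1)   # candidate-to-mic distances
        simulated = (d[1:] - d[0]) / c                  # delays relative to mic 0
        err = float(np.sum((simulated - measured) ** 2))
        if err < best_err:
            best, best_err = p, err
    return best

# Usage: 3 microphones, a broadband source placed at grid point (3, 4)
sr, c = 34300, 343.0                                 # 1 sample corresponds to ~1 cm of path
mics = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
source = np.array([3.0, 4.0])
base = np.random.default_rng(0).standard_normal(4000)
signals = []
for m in mics:
    off = int(round(np.linalg.norm(source - m) / c * sr))  # propagation delay, samples
    s = np.zeros(6000)
    s[off:off + 4000] = base
    signals.append(s)
grid = [np.array([x, y], dtype=float) for x in range(1, 6) for y in range(1, 6)]
found = locate(signals, mics, grid, sr, c)
```

Note that two microphones constrain only a direction, so the sketch uses three; a real spherical array gives many more delay pairs and a correspondingly sharper match.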
In one embodiment, the method further comprises:
acquiring an environmental image collected by a monitoring dome camera in the target environment, and generating sound-image evidence information of the noise source by combining the environmental image with the target noise position.
In the embodiment of the application, besides the microphone array, a monitoring dome camera is also arranged in the target environment to collect environmental image information. By plotting the determined target noise position onto the environmental image, the building from which the noise source emits can be identified, and the noise source sound-image evidence information is then generated for subsequent follow-up and regulation.
The beamforming-based adaptive positioning apparatus for environmental noise sources provided by the embodiment of the present application will be described in detail below with reference to fig. 2. It should be noted that the apparatus shown in fig. 2 is used for executing the method of the embodiment shown in fig. 1 of the present application; for convenience of description, only the portions related to the embodiment of the present application are shown. For undisclosed specific technical details, please refer to the embodiment shown in fig. 1 of the present application.
Referring to fig. 2, fig. 2 is a schematic structural diagram of an adaptive positioning apparatus for environmental noise source based on beamforming according to an embodiment of the present application. As shown in fig. 2, the apparatus includes:
the acquisition module 201 is configured to acquire an environmental recording collected by a microphone array in a target environment, and to extract voiceprint Mel-spectrum features of the environmental recording based on the Mel cepstrum;
the selection module 202 is configured to determine the noise source type corresponding to each voiceprint Mel-spectrum feature based on a KNN algorithm, and to select all noise features corresponding to the noise source type from a preset noise database;
and the positioning module 203 is configured to perform beamforming positioning on the environmental recording based on each noise feature to obtain a target noise position.
In one implementation, the obtaining module 201 includes:
the first calculation unit is used for calculating a fast Fourier transform spectrum corresponding to the environmental recording;
and the second calculation unit is used for passing the fast Fourier transform spectrum through a Mel-frequency filter to obtain the voiceprint Mel-spectrum features.
In one possible implementation, the selecting module 202 includes:
the first acquisition unit is used for acquiring a classified-sample voiceprint feature plane corresponding to the preset noise database and mapping each voiceprint Mel-spectrum feature onto the classified-sample voiceprint feature plane;
the third calculation unit is used for calculating the distance between each mapped voiceprint Mel-spectrum feature and the sample voiceprint features in the classified-sample voiceprint feature plane, and selecting a first preset number of first sample voiceprint features in order of increasing distance;
and the statistics unit is used for counting the number of occurrences of each noise source type among the first sample voiceprint features and determining the most frequent noise source type as the noise source type corresponding to the voiceprint Mel-spectrum features.
In one possible implementation, the positioning module 203 includes:
the screening unit is used for filtering the environmental recording based on each noise feature and removing the sound features in the environmental recording that match none of the noise features, to obtain a noise recording;
and the positioning unit is used for performing beamforming positioning on the noise recording to obtain a target noise position.
In one embodiment, the positioning unit comprises:
a calculation element for calculating, based on a cross-correlation method, a first relative reception delay of the noise recording between the microphones of the microphone array;
the first determining element is used for determining a target plane corresponding to the noise recording and dividing the target plane into a second preset number of area positions;
a simulation element for simulating, for each area position, a second relative reception delay between the microphones;
and a second determining element, configured to determine the target second relative reception delay having the smallest difference from the first relative reception delay, wherein the area position corresponding to that target second relative reception delay is the target noise position.
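The delay-matching localization performed by these elements can be illustrated with a two-microphone sketch; the speed of sound, the grid of area positions, and the lag sign convention are assumptions for illustration only:

```python
import numpy as np

def relative_delay(sig_a, sig_b, sr):
    """First relative reception delay between two microphones via cross-correlation:
    the lag (in seconds) at which the correlation peaks. A positive result means
    the signal reaches microphone A later than microphone B."""
    corr = np.correlate(sig_a, sig_b, mode="full")
    lag = np.argmax(corr) - (len(sig_b) - 1)
    return lag / sr

def locate_on_grid(measured_delay, mic_a, mic_b, grid_points, c=343.0):
    """Divide the target plane into candidate area positions, simulate the delay
    each position would produce, and pick the one closest to the measured delay."""
    best, best_err = None, np.inf
    for p in grid_points:
        # Simulated second relative reception delay: path-length difference / speed of sound.
        simulated = (np.linalg.norm(p - mic_a) - np.linalg.norm(p - mic_b)) / c
        err = abs(simulated - measured_delay)
        if err < best_err:
            best, best_err = p, err
    return best
```

With more than two microphones, the same argmin would run over the vector of pairwise delays, and a finer grid (second preset number) trades computation for positional resolution.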
In one embodiment, the apparatus further comprises:
and the combination module is used for acquiring an environment image captured by a monitoring dome camera in the target environment and generating noise source acoustic-image evidence information by combining the environment image with the target noise position.
It is clear to a person skilled in the art that the solution according to the embodiments of the present application can be implemented by means of software and/or hardware. The "unit" and "module" in this specification refer to software and/or hardware that can perform a specific function independently or in cooperation with other components, where the hardware may be, for example, a Field-Programmable Gate Array (FPGA), an Integrated Circuit (IC), or the like.
Each processing unit and/or module in the embodiments of the present application may be implemented by an analog circuit that implements the functions described in the embodiments of the present application, or may be implemented by software that executes the functions described in the embodiments of the present application.
Referring to fig. 3, a schematic structural diagram of an electronic device according to an embodiment of the present application is shown, where the electronic device may be used to implement the method in the embodiment shown in fig. 1. As shown in fig. 3, the electronic device 300 may include: at least one central processor 301, at least one network interface 304, a user interface 303, a memory 305, at least one communication bus 302.
The communication bus 302 is used to enable connection and communication between these components.
The user interface 303 may include a display screen (Display) and a camera (Camera); optionally, the user interface 303 may further include a standard wired interface and a wireless interface.
The network interface 304 may optionally include a standard wired interface or a wireless interface (e.g., a Wi-Fi interface).
The central processor 301 may include one or more processing cores. The central processor 301 connects the various parts of the electronic device 300 using various interfaces and lines, and performs the various functions of the terminal 300 and processes data by running or executing instructions, programs, code sets, or instruction sets stored in the memory 305 and by calling data stored in the memory 305. Optionally, the central processor 301 may be implemented in at least one hardware form among Digital Signal Processing (DSP), Field-Programmable Gate Array (FPGA), and Programmable Logic Array (PLA). The central processor 301 may integrate one or a combination of a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), a modem, and the like. The CPU mainly handles the operating system, user interface, application programs, and the like; the GPU is responsible for rendering and drawing the content to be displayed on the display screen; and the modem handles wireless communications. It is understood that the modem may also not be integrated into the central processor 301 and may instead be implemented by a separate chip.
The memory 305 may include a Random Access Memory (RAM) or a Read-Only Memory (ROM). Optionally, the memory 305 includes a non-transitory computer-readable medium. The memory 305 may be used to store instructions, programs, code sets, or instruction sets. The memory 305 may include a program storage area and a data storage area, wherein the program storage area may store instructions for implementing an operating system, instructions for at least one function (such as a touch function, a sound playing function, an image playing function, etc.), instructions for implementing the various method embodiments described above, and the like; the data storage area may store the data referred to in the above method embodiments. Optionally, the memory 305 may also be at least one storage device located remotely from the central processor 301. As shown in fig. 3, the memory 305, as a computer storage medium, may include an operating system, a network communication module, a user interface module, and program instructions.
In the electronic device 300 shown in fig. 3, the user interface 303 is mainly used for providing an input interface for a user to obtain data input by the user; and the central processor 301 may be configured to invoke the beamforming-based ambient noise source adaptive positioning application stored in the memory 305 and specifically perform the following operations:
acquiring an environment recording collected by a microphone array in a target environment, and extracting voiceprint Mel-spectrum features of the environment recording based on the Mel-frequency cepstrum;
determining a noise source type corresponding to each voiceprint Mel-spectrum feature based on a KNN algorithm, and selecting all noise features corresponding to the noise source type from a preset noise database;
and performing beamforming positioning on the environment recording based on each noise feature to obtain a target noise position.
The present application also provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of the above-described method. The computer-readable storage medium may include, but is not limited to, any type of disk including floppy disks, optical disks, DVDs, CD-ROMs, microdrive, and magneto-optical disks, ROMs, RAMs, EPROMs, EEPROMs, DRAMs, VRAMs, flash memory devices, magnetic or optical cards, nanosystems (including molecular memory ICs), or any type of media or device suitable for storing instructions and/or data.
It should be noted that for simplicity of description, the above-mentioned embodiments of the method are described as a series of acts, but those skilled in the art should understand that the present application is not limited by the described order of acts, as some steps may be performed in other orders or simultaneously according to the present application. Further, those skilled in the art should also appreciate that the embodiments described in the specification are preferred embodiments and that the acts and modules referred to are not necessarily required in this application.
In the foregoing embodiments, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.
In the several embodiments provided in the present application, it should be understood that the disclosed apparatus may be implemented in other manners. For example, the above-described apparatus embodiments are merely illustrative; for instance, the division of the units is only one type of logical functional division, and there may be other divisions in actual implementation: multiple units or components may be combined or integrated into another system, or some features may be omitted or not implemented. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some service interfaces, devices, or units, and may be in electrical or other forms.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one position, or may be distributed on multiple network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer-readable memory. Based on such understanding, the technical solutions of the present application, in essence, or the part contributing to the prior art, or all or part of the technical solutions, can be embodied in the form of a software product, which is stored in a memory and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to execute all or part of the steps of the methods described in the embodiments of the present application. The aforementioned memory includes various media capable of storing program code, such as a USB flash disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a removable hard disk, a magnetic disk, or an optical disk.
Those skilled in the art will appreciate that all or part of the steps in the methods of the above embodiments may be implemented by a program, which is stored in a computer-readable memory; the memory may include flash disks, Read-Only Memories (ROMs), Random Access Memories (RAMs), magnetic or optical disks, and the like.
The above description is merely an exemplary embodiment of the present disclosure, and the scope of the present disclosure is not limited thereto. That is, all equivalent changes and modifications made in accordance with the teachings of the present disclosure are intended to be included within the scope of the present disclosure. Embodiments of the present disclosure will be readily apparent to those skilled in the art from consideration of the specification and practice of the disclosure herein. This application is intended to cover any variations, uses, or adaptations of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known or customary practice within the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.

Claims (8)

1. A beamforming-based adaptive positioning method for environmental noise sources, the method comprising:
acquiring an environment recording collected by a microphone array in a target environment, and extracting voiceprint Mel-spectrum features of the environment recording based on the Mel-frequency cepstrum;
determining a noise source type corresponding to each voiceprint Mel-spectrum feature based on a KNN algorithm, and selecting all noise features corresponding to the noise source type from a preset noise database, wherein the noise features are used for reverse screening of the environment recording;
performing beam forming positioning on the environment recording based on each noise characteristic to obtain a target noise position;
the performing beamforming positioning on the environmental recording based on each of the noise characteristics to obtain a target noise position includes:
filtering the environment recording based on each noise feature, and removing sound features in the environment recording that do not match any of the noise features, to obtain a noise recording;
and carrying out beam forming positioning on the noise record to obtain a target noise position.
2. The method of claim 1, wherein the extracting voiceprint Mel-spectrum features of the environment recording based on the Mel-frequency cepstrum comprises:
calculating a fast Fourier transform spectrum corresponding to the environment recording;
and passing the fast Fourier transform spectrum through a Mel-frequency filter bank to obtain the voiceprint Mel-spectrum features.
3. The method according to claim 1, wherein the determining a noise source type corresponding to each voiceprint Mel-spectrum feature based on a KNN algorithm comprises:
obtaining a classified sample voiceprint feature plane corresponding to the preset noise database, and mapping each voiceprint Mel-spectrum feature to the classified sample voiceprint feature plane;
calculating the distance between each mapped voiceprint Mel-spectrum feature and the sample voiceprint features in the classified sample voiceprint feature plane, and selecting a first preset number of first sample voiceprint features in order of distance from smallest to largest;
and counting the number of occurrences of each noise source type among the first sample voiceprint features, and determining the noise source type with the largest count as the noise source type corresponding to the voiceprint Mel-spectrum feature.
4. The method of claim 1, wherein the beamforming positioning the noise recording to obtain a target noise position comprises:
calculating, based on a cross-correlation method, a first relative reception delay of the noise recording between the microphones of the microphone array;
determining a target plane corresponding to the noise recording, and dividing the target plane into a second preset number of area positions;
for each area position, simulating a second relative reception delay between the microphones;
and determining the target second relative reception delay with the smallest difference from the first relative reception delay, wherein the area position corresponding to the target second relative reception delay is the target noise position.
5. The method of claim 1, further comprising:
and acquiring an environment image captured by a monitoring dome camera in the target environment, and generating noise source acoustic-image evidence information by combining the environment image with the target noise position.
6. An apparatus for adaptive positioning of ambient noise sources based on beamforming, the apparatus comprising:
the acquisition module is used for acquiring an environment recording collected by a microphone array in a target environment and extracting voiceprint Mel-spectrum features of the environment recording based on the Mel-frequency cepstrum;
the selecting module is used for determining the noise source type corresponding to each voiceprint Mel-spectrum feature based on a KNN algorithm, and selecting all noise features corresponding to the noise source type from a preset noise database, wherein the noise features are used for reverse screening of the environment recording;
the positioning module is used for performing beamforming positioning on the environment recording based on each noise feature to obtain a target noise position;
the positioning module includes:
the screening unit is used for filtering the environment recording based on each noise feature, removing sound features in the environment recording that do not match any of the noise features, to obtain the noise recording;
and the positioning unit is used for carrying out beam forming positioning on the noise record to obtain a target noise position.
7. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the steps of the method according to any of claims 1-5 are implemented when the computer program is executed by the processor.
8. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the steps of the method according to any one of claims 1 to 5.
CN202210778085.0A 2022-07-04 2022-07-04 Self-adaptive positioning method and device for environmental noise source based on beam forming Active CN114863943B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210778085.0A CN114863943B (en) 2022-07-04 2022-07-04 Self-adaptive positioning method and device for environmental noise source based on beam forming

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210778085.0A CN114863943B (en) 2022-07-04 2022-07-04 Self-adaptive positioning method and device for environmental noise source based on beam forming

Publications (2)

Publication Number Publication Date
CN114863943A CN114863943A (en) 2022-08-05
CN114863943B true CN114863943B (en) 2022-11-04

Family

ID=82625942

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210778085.0A Active CN114863943B (en) 2022-07-04 2022-07-04 Self-adaptive positioning method and device for environmental noise source based on beam forming

Country Status (1)

Country Link
CN (1) CN114863943B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115547356B (en) * 2022-11-25 2023-03-10 杭州兆华电子股份有限公司 Wind noise processing method and system based on abnormal sound detection of unmanned aerial vehicle

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN206573209U (en) * 2017-01-25 2017-10-20 大连理工大学 A kind of Noise Sources Identification system based on Phase conjugation theory
CN107547981A (en) * 2017-05-17 2018-01-05 宁波桑德纳电子科技有限公司 A kind of audio collecting device, supervising device and collection sound method
CN108538320A (en) * 2018-03-30 2018-09-14 广东欧珀移动通信有限公司 Recording control method and device, readable storage medium storing program for executing, terminal
CN110767226A (en) * 2019-10-30 2020-02-07 山西见声科技有限公司 Sound source positioning method and device with high accuracy, voice recognition method and system, storage equipment and terminal
CN112463103A (en) * 2019-09-06 2021-03-09 北京声智科技有限公司 Sound pickup method, sound pickup device, electronic device and storage medium
CN113689873A (en) * 2021-09-07 2021-11-23 联想(北京)有限公司 Noise suppression method, device, electronic equipment and storage medium

Family Cites Families (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5581620A (en) * 1994-04-21 1996-12-03 Brown University Research Foundation Methods and apparatus for adaptive beamforming
EP2197219B1 (en) * 2008-12-12 2012-10-24 Nuance Communications, Inc. Method for determining a time delay for time delay compensation
US8861756B2 (en) * 2010-09-24 2014-10-14 LI Creative Technologies, Inc. Microphone array system
KR20120059827A (en) * 2010-12-01 2012-06-11 삼성전자주식회사 Apparatus for multiple sound source localization and method the same
US9215328B2 (en) * 2011-08-11 2015-12-15 Broadcom Corporation Beamforming apparatus and method based on long-term properties of sources of undesired noise affecting voice quality
KR101282673B1 (en) * 2011-12-09 2013-07-05 현대자동차주식회사 Method for Sound Source Localization
US9666175B2 (en) * 2015-07-01 2017-05-30 zPillow, Inc. Noise cancelation system and techniques
US10482899B2 (en) * 2016-08-01 2019-11-19 Apple Inc. Coordination of beamformers for noise estimation and noise suppression
CN108760034A (en) * 2018-05-21 2018-11-06 广西电网有限责任公司电力科学研究院 A kind of transformer vibration noise source positioning system and method
CN110691299B (en) * 2019-08-29 2020-12-11 科大讯飞(苏州)科技有限公司 Audio processing system, method, apparatus, device and storage medium
CN111175698B (en) * 2020-01-18 2022-12-20 国网山东省电力公司菏泽供电公司 Transformer noise source positioning method, system and device based on sound and vibration combination
DE102020207586A1 (en) * 2020-06-18 2021-12-23 Sivantos Pte. Ltd. Hearing system with at least one hearing instrument worn on the head of the user and a method for operating such a hearing system
CN111880148A (en) * 2020-08-07 2020-11-03 北京字节跳动网络技术有限公司 Sound source positioning method, device, equipment and storage medium
CN112098939B (en) * 2020-09-18 2021-09-24 广东电网有限责任公司电力科学研究院 Method and device for identifying and evaluating noise pollution source
CN112687294A (en) * 2020-12-21 2021-04-20 重庆科技学院 Vehicle-mounted noise identification method
CN112966560A (en) * 2021-02-03 2021-06-15 郑州大学 Electric spindle fault diagnosis method and device based on deconvolution imaging
CN114355290B (en) * 2022-03-22 2022-06-24 杭州兆华电子股份有限公司 Sound source three-dimensional imaging method and system based on stereo array
CN114509162B (en) * 2022-04-18 2022-06-21 四川三元环境治理股份有限公司 Sound environment data monitoring method and system

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN206573209U (en) * 2017-01-25 2017-10-20 大连理工大学 A kind of Noise Sources Identification system based on Phase conjugation theory
CN107547981A (en) * 2017-05-17 2018-01-05 宁波桑德纳电子科技有限公司 A kind of audio collecting device, supervising device and collection sound method
CN108538320A (en) * 2018-03-30 2018-09-14 广东欧珀移动通信有限公司 Recording control method and device, readable storage medium storing program for executing, terminal
CN112463103A (en) * 2019-09-06 2021-03-09 北京声智科技有限公司 Sound pickup method, sound pickup device, electronic device and storage medium
CN110767226A (en) * 2019-10-30 2020-02-07 山西见声科技有限公司 Sound source positioning method and device with high accuracy, voice recognition method and system, storage equipment and terminal
CN113689873A (en) * 2021-09-07 2021-11-23 联想(北京)有限公司 Noise suppression method, device, electronic equipment and storage medium

Also Published As

Publication number Publication date
CN114863943A (en) 2022-08-05

Similar Documents

Publication Publication Date Title
US11657798B2 (en) Methods and apparatus to segment audio and determine audio segment similarities
CN110875060A (en) Voice signal processing method, device, system, equipment and storage medium
CN109637525B (en) Method and apparatus for generating an on-board acoustic model
CN114863943B (en) Self-adaptive positioning method and device for environmental noise source based on beam forming
US20140278415A1 (en) Voice Recognition Configuration Selector and Method of Operation Therefor
CN111868823A (en) Sound source separation method, device and equipment
CN108877783A (en) The method and apparatus for determining the audio types of audio data
CN111077496A (en) Voice processing method and device based on microphone array and terminal equipment
CN111540375A (en) Training method of audio separation model, and audio signal separation method and device
CN114333881B (en) Audio transmission noise reduction method, device and medium based on environment self-adaptation
AU2022275486A1 (en) Methods and apparatus to fingerprint an audio signal via normalization
CN108387757B (en) Method and apparatus for detecting moving state of movable device
CN113259832A (en) Microphone array detection method and device, electronic equipment and storage medium
CN112382302A (en) Baby cry identification method and terminal equipment
CN113327628A (en) Audio processing method and device, readable medium and electronic equipment
US20220212108A1 (en) Audio frequency signal processing method and apparatus, terminal and storage medium
CN111400511A (en) Multimedia resource interception method and device
CN111199749A (en) Behavior recognition method, behavior recognition apparatus, machine learning method, machine learning apparatus, and recording medium
CN114944152A (en) Vehicle whistling sound identification method
CN114415115B (en) Target signal frequency automatic optimization method for assisting direction of arrival positioning
CN114020192B (en) Interaction method and system for realizing nonmetal plane based on curved surface capacitor
CN117789755A (en) Audio data detection method and device and electronic equipment
CN115798520A (en) Voice detection method and device, electronic equipment and storage medium
CN115910107A (en) Audio data detection method, computer and readable storage medium
CN116129915A (en) Identity recognition method, voice quality inspection method and related equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant