CN114863943A - Self-adaptive positioning method and device for environmental noise source based on beam forming - Google Patents

Self-adaptive positioning method and device for environmental noise source based on beam forming

Info

Publication number
CN114863943A
CN114863943A (application number CN202210778085.0A)
Authority
CN
China
Prior art keywords
noise
voiceprint
target
recording
environment
Prior art date
Legal status
Granted
Application number
CN202210778085.0A
Other languages
Chinese (zh)
Other versions
CN114863943B (en)
Inventor
曹祖杨
周航
侯佩佩
张鑫
李佳罗
闫昱甫
洪全付
陶慧芳
方吉
Current Assignee
Hangzhou Crysound Electronics Co Ltd
Original Assignee
Hangzhou Crysound Electronics Co Ltd
Priority date
Filing date
Publication date
Application filed by Hangzhou Crysound Electronics Co Ltd filed Critical Hangzhou Crysound Electronics Co Ltd
Priority to CN202210778085.0A priority Critical patent/CN114863943B/en
Publication of CN114863943A publication Critical patent/CN114863943A/en
Application granted granted Critical
Publication of CN114863943B publication Critical patent/CN114863943B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208 Noise filtering
    • G10L21/0216 Noise filtering characterised by the method used for estimating noise
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01S RADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S1/00 Beacons or beacon systems transmitting signals having a characteristic or characteristics capable of being detected by non-directional receivers and defining directions, positions, or position lines fixed relatively to the beacon transmitters; Receivers co-operating therewith
    • G01S1/72 Beacons or beacon systems transmitting signals having a characteristic or characteristics capable of being detected by non-directional receivers and defining directions, positions, or position lines fixed relatively to the beacon transmitters; Receivers co-operating therewith using ultrasonic, sonic or infrasonic waves
    • G01S1/76 Systems for determining direction or position line
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01S RADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S5/00 Position-fixing by co-ordinating two or more direction or position line determinations; Position-fixing by co-ordinating two or more distance determinations
    • G01S5/18 Position-fixing by co-ordinating two or more direction or position line determinations; Position-fixing by co-ordinating two or more distance determinations using ultrasonic, sonic, or infrasonic waves
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2413 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on distances to training or reference patterns
    • G06F18/24147 Distances to closest patterns, e.g. nearest neighbour classification
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/03 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
    • G10L25/24 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters the extracted parameters being the cepstrum
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208 Noise filtering
    • G10L21/0216 Noise filtering characterised by the method used for estimating noise
    • G10L2021/02161 Number of inputs available containing the signal or the noise to be suppressed
    • G10L2021/02166 Microphone arrays; Beamforming

Abstract

The invention discloses a beamforming-based adaptive positioning method and device for environmental noise sources. The method comprises: acquiring an environmental recording collected by a microphone array in a target environment, and extracting voiceprint Mel-spectrum features of the environmental recording based on the Mel cepstrum; determining the noise source type corresponding to each voiceprint Mel-spectrum feature with a KNN algorithm, and selecting all noise features corresponding to that noise source type from a preset noise database; and performing beamforming positioning on the environmental recording based on each noise feature to obtain the target noise position. By extracting the voiceprint Mel-spectrum features of the environmental recording, determining the noise type with the KNN algorithm, reversely screening the environmental recording with the noise features of that type, and then locating the noise source by beamforming, the application enables the sound source positioning system to locate any kind of noise source automatically and accurately as it occurs, with high positioning precision.

Description

Self-adaptive positioning method and device for environmental noise source based on beam forming
Technical Field
The present invention relates to the field of sound source localization technologies, and in particular, to a method and an apparatus for adaptively locating an ambient noise source based on beamforming.
Background
With the development of urban construction, urban areas contain more and more facilities and generate more and more noise, and the noise pollution caused by environmental noise needs to be monitored. At present, environmental noise monitoring faces difficulties in law enforcement and evidence collection: when the noise in a monitored area exceeds the limit, several noise-producing units may be located near the monitoring point, and the noise source cannot be located from the recording alone, so effective supervision is impossible. For a noise source with a fixed frequency, the traditional beamforming method can achieve a certain degree of localization. In environmental monitoring, however, the noise sources are diverse and their frequency ranges are wide, so the traditional beamforming method cannot achieve automatic localization. In summary, there is currently no method that can accurately locate an over-limit noise source in environmental monitoring.
Disclosure of Invention
In order to solve the above problem, embodiments of the present application provide a method and an apparatus for adaptive positioning of environmental noise sources based on beamforming.
In a first aspect, an embodiment of the present application provides a beamforming-based adaptive positioning method for environmental noise sources, where the method includes:
acquiring an environmental recording collected by a microphone array in a target environment, and extracting voiceprint Mel-spectrum features of the environmental recording based on the Mel cepstrum;
determining a noise source type corresponding to each voiceprint Mel-spectrum feature based on a KNN algorithm, and selecting all noise features corresponding to the noise source type from a preset noise database;
and performing beamforming positioning on the environmental recording based on each of the noise features to obtain a target noise position.
Preferably, the extracting of the voiceprint Mel-spectrum features of the environmental recording based on the Mel cepstrum includes:
calculating a fast Fourier transform spectrum corresponding to the environmental recording;
and passing the fast Fourier transform spectrum through a Mel-frequency filter to obtain the voiceprint Mel-spectrum features.
Preferably, the determining of the noise source type corresponding to each voiceprint Mel-spectrum feature based on the KNN algorithm includes:
obtaining a classified-sample voiceprint feature plane corresponding to the preset noise database, and mapping each voiceprint Mel-spectrum feature onto the classified-sample voiceprint feature plane;
calculating the distances between the mapped voiceprint Mel-spectrum feature and the sample voiceprint features in the classified-sample voiceprint feature plane, and selecting a first preset number of first sample voiceprint features in order of increasing distance;
and counting the number of samples of each noise source type among the first sample voiceprint features, and determining the noise source type with the largest count as the noise source type corresponding to the voiceprint Mel-spectrum feature.
Preferably, the performing beamforming positioning on the environmental recording based on each of the noise features to obtain a target noise position includes:
filtering the environmental recording based on each of the noise features, and removing sound features in the environmental recording that do not match any of the noise features, to obtain a noise recording;
and performing beamforming positioning on the noise recording to obtain the target noise position.
Preferably, the performing beamforming positioning on the noise recording to obtain a target noise position includes:
calculating, based on a cross-correlation method, a first relative reception delay of the noise recording between the microphones of the microphone array;
determining a target plane corresponding to the noise recording, and dividing the target plane into a second preset number of region positions;
simulating, for each region position, a second relative reception delay between the microphones;
and determining the target second relative reception delay whose difference from the first relative reception delay is smallest, the target region position corresponding to the target second relative reception delay being the target noise position.
Preferably, the method further comprises:
and acquiring an environmental image collected by a monitoring dome camera in the target environment, and generating noise-source acoustic image evidence information by combining the environmental image with the target noise position.
In a second aspect, an embodiment of the present application provides a beamforming-based adaptive positioning apparatus for environmental noise sources, where the apparatus includes:
an acquisition module, configured to acquire an environmental recording collected by a microphone array in a target environment, and to extract voiceprint Mel-spectrum features of the environmental recording based on the Mel cepstrum;
a selection module, configured to determine a noise source type corresponding to each voiceprint Mel-spectrum feature based on a KNN algorithm, and to select all noise features corresponding to the noise source type from a preset noise database;
and a positioning module, configured to perform beamforming positioning on the environmental recording based on each of the noise features to obtain a target noise position.
In a third aspect, an embodiment of the present application provides an electronic device, which includes a memory, a processor, and a computer program stored on the memory and executable on the processor, where the processor executes the computer program to implement the steps of the method as provided in the first aspect or any one of the possible implementation manners of the first aspect.
In a fourth aspect, the present application provides a computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements the method as provided in the first aspect or any one of the possible implementations of the first aspect.
The beneficial effects of the invention are as follows: by extracting the voiceprint Mel-spectrum features of the environmental recording, determining the noise type with the KNN algorithm, reversely screening the environmental recording based on the noise features of that noise type, and then locating the noise source position by beamforming, the sound source positioning system can automatically and accurately locate various noise sources as they occur, with high positioning accuracy.
Drawings
In order to illustrate the technical solutions in the embodiments of the present application more clearly, the drawings needed in the embodiments are briefly described below. Obviously, the drawings in the following description show only some embodiments of the present application, and those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1 is a schematic flowchart of a beamforming-based adaptive positioning method for environmental noise sources according to an embodiment of the present application;
fig. 2 is a schematic structural diagram of an adaptive positioning apparatus for environmental noise source based on beamforming according to an embodiment of the present application;
fig. 3 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application.
In the following description, the terms "first" and "second" are used for descriptive purposes only and are not intended to indicate or imply relative importance. The following description provides embodiments of the present application, which may be combined or interchanged with one another, and therefore the present application should also be construed as encompassing all possible combinations of the same and/or different embodiments described. Thus, if one embodiment includes features A, B, C and another embodiment includes features B, D, then the present application should also be construed to include embodiments that contain one or more of all other possible combinations of A, B, C, and D, even though such embodiments may not be explicitly recited in the text below.
The following description provides examples, and does not limit the scope, applicability, or examples set forth in the claims. Changes may be made in the function and arrangement of elements described without departing from the scope of the disclosure. Various examples may omit, substitute, or add various procedures or components as appropriate. For example, the described methods may be performed in an order different than the order described, and various steps may be added, omitted, or combined. Furthermore, features described with respect to some examples may be combined into other examples.
Referring to fig. 1, fig. 1 is a schematic flowchart of a beamforming-based adaptive positioning method for environmental noise sources according to an embodiment of the present application. In an embodiment of the present application, the method includes:
s101, acquiring an environment recording acquired by a microphone array in a target environment, and extracting a voiceprint mei-spectrum feature of the environment recording based on a mel-frequency cepstrum.
The execution subject may be a cloud server of the sound source positioning system.
In the embodiment of the application, the cloud server first acquires the environmental recording collected by the microphone array in the target environment that requires noise monitoring; the microphone array may be a spherical microphone array. After the environmental recording is collected, the cloud server extracts the voiceprint Mel-spectrum features from the environmental recording by means of the Mel cepstrum, so that the noise can subsequently be identified and determined.
In one possible implementation, the extracting of the voiceprint Mel-spectrum features of the environmental recording based on the Mel cepstrum includes:
calculating a fast Fourier transform spectrum corresponding to the environmental recording;
and passing the fast Fourier transform spectrum through a Mel-frequency filter to obtain the voiceprint Mel-spectrum features.
In the embodiment of the present application, in order to obtain the voiceprint Mel-spectrum features from the environmental recording, the fast Fourier transform spectrum of the environmental recording, i.e. the FFT spectrum, must first be calculated. Its calculation formula is:
X(k) = Σ_{n=0}^{N-1} x(n) · W_N^{kn}
where x(n) is the finite-length discrete signal, i.e. the environmental recording, with n = 0, 1, ..., N-1; W_N^{kn} = cos(2πkn/N) + i·sin(2πkn/N) is the kernel written with Euler's formula; and X(k) is the complex spectrum data composed of the amplitude and phase of the periodic component at frequency k/N.
After the fast Fourier transform spectrum is calculated, it is passed through the Mel-frequency filter to obtain the voiceprint Mel-spectrum features. Since the Mel filter coefficients of the Mel-frequency filter are fixed, the voiceprint Mel-spectrum features obtained in this step can be expressed by the following formula:
S(m) = Σ_k X(k) · H_m(k)
where X(k) is the FFT spectrum data, H_m(k) are the Mel filter coefficients, and S is the voiceprint Mel-spectrum feature.
S102, determining the noise source type corresponding to each voiceprint Mel-spectrum feature based on a KNN algorithm, and selecting all noise features corresponding to the noise source type from a preset noise database.
In the embodiment of the application, the obtained voiceprint Mel-spectrum features are classified by a KNN algorithm, and the noise source type corresponding to each voiceprint Mel-spectrum feature is determined. After the noise source type is determined, the noise position in the environmental recording is not determined directly; instead, all noise features corresponding to the obtained noise source type are first retrieved from a preset noise database, so that the environmental recording can be reversely screened according to these noise features. This avoids missing noise features and ensures the accuracy of the final positioning.
In an embodiment, the determining of the noise source type corresponding to each voiceprint Mel-spectrum feature based on the KNN algorithm includes:
obtaining a classified-sample voiceprint feature plane corresponding to the preset noise database, and mapping each voiceprint Mel-spectrum feature onto the classified-sample voiceprint feature plane;
calculating the distances between the mapped voiceprint Mel-spectrum feature and the sample voiceprint features in the classified-sample voiceprint feature plane, and selecting a first preset number of first sample voiceprint features in order of increasing distance;
and counting the number of samples of each noise source type among the first sample voiceprint features, and determining the noise source type with the largest count as the noise source type corresponding to the voiceprint Mel-spectrum feature.
In the embodiment of the present application, the specific calculation process of the KNN algorithm is as follows: the calculated voiceprint feature is mapped onto the voiceprint feature plane of the classified samples in the noise library; the distances between the current voiceprint feature and the sample voiceprint features of each class are calculated on the mapped feature plane; the sample types of the K samples with the smallest distances are counted; and the noise type of the signal is the type with the largest count. For example, when K = 5, if the five noise-library samples closest to the signal on the feature plane are [1 industrial noise, 2 industrial noise, 3 human voice, 4 vehicle noise, 5 industrial noise], the signal is classified as industrial noise.
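The snippet below is a compact sketch of this KNN step under assumed data (the real noise library and feature dimensionality are not specified here): it measures Euclidean distances from the extracted Mel-spectrum feature to the pre-classified library samples, takes the K = 5 nearest, and votes on the noise-source class.

```python
# Sketch: KNN classification of one voiceprint Mel-spectrum feature against a noise library.
import numpy as np
from collections import Counter

def knn_classify(feature, sample_features, sample_labels, k=5):
    """Return the majority class among the k noise-library samples closest to `feature`."""
    distances = np.linalg.norm(sample_features - feature, axis=1)   # distance to every sample
    nearest = np.argsort(distances)[:k]                             # indices of the k smallest distances
    votes = Counter(sample_labels[i] for i in nearest)              # count each noise-source class
    return votes.most_common(1)[0][0]

# Usage mirroring the example in the text: if the 5 nearest samples are
# [industrial, industrial, human voice, vehicle, industrial] the result is "industrial noise".
sample_features = np.random.randn(100, 26)   # pre-classified noise-library features (synthetic)
sample_labels = np.random.choice(
    ["industrial noise", "human voice", "vehicle noise"], size=100)
feature = np.random.randn(26)                # feature extracted from the environmental recording
print(knn_classify(feature, sample_features, sample_labels, k=5))
```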
S103, performing beamforming positioning on the environmental recording based on each noise feature to obtain a target noise position.
In the embodiment of the application, after the noise features of the noise source type present in the environmental recording are determined, the environmental recording is reversely screened using these noise features; beamforming positioning is then performed on the screened recording to determine the target noise position corresponding to the noise source that produces the over-limit noise in the environmental recording.
In one possible embodiment, step S103 includes:
filtering the environmental recording based on each of the noise features, and removing sound features in the environmental recording that do not match any of the noise features, to obtain a noise recording;
and performing beamforming positioning on the noise recording to obtain the target noise position.
In the embodiment of the application, the environmental recording is filtered according to the obtained noise features so as to retain the sound features in the environmental recording that match the noise features; the remaining unmatched sound features are removed, yielding the noise recording contained in the environmental recording. The target noise position is then obtained by beamforming the noise recording. Compared with the traditional approach, the features corresponding to the noise are not determined directly from the environmental recording; instead, after the noise source type is determined from the environmental recording, the environmental recording is screened based on the noise features corresponding to that type of noise source, so that all noise features in the environmental recording can be captured. This avoids the inaccurate positioning, or failure to position at all, that occurs in the traditional direct feature-extraction approach when some features are not recognized or are missed.
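The text does not spell out how the match between the recording and the stored noise features is computed, so the sketch below shows only one plausible realization of the reverse screening, purely for illustration: frequency bins that an aggregate spectral template of the selected noise class expresses strongly are kept, and everything else is zeroed before beamforming. The template format, the keep_ratio threshold and all names are assumptions, not the patented method.

```python
# Sketch: spectral-mask "reverse screening" of one microphone channel (one possible realization).
import numpy as np

def reverse_screen(channel, noise_templates, keep_ratio=0.25):
    """Suppress spectral content of one channel that does not match the selected noise class.

    noise_templates: array (n_features, n_bins) of magnitude spectra from the noise database.
    keep_ratio: fraction of bins retained (illustrative threshold, not from the patent).
    """
    spectrum = np.fft.rfft(channel)
    template = noise_templates.mean(axis=0)                 # aggregate noise-class template
    threshold = np.quantile(template, 1.0 - keep_ratio)     # keep the most strongly expressed bins
    mask = (template >= threshold).astype(float)
    return np.fft.irfft(spectrum * mask, n=len(channel))    # "noise recording" for this channel

# Usage: screen one 1-second channel with templates matching its FFT length (synthetic data).
channel = np.random.randn(16000)
noise_templates = np.abs(np.random.randn(10, len(np.fft.rfft(channel))))
noise_channel = reverse_screen(channel, noise_templates)
```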
In one embodiment, the performing beamforming positioning on the noise recording to obtain a target noise position includes:
calculating, based on a cross-correlation method, a first relative reception delay of the noise recording between the microphones of the microphone array;
determining a target plane corresponding to the noise recording, and dividing the target plane into a second preset number of region positions;
simulating, for each region position, a second relative reception delay between the microphones;
and determining the target second relative reception delay whose difference from the first relative reception delay is smallest, the target region position corresponding to the target second relative reception delay being the target noise position.
In the embodiment of the present application, the specific beamforming process is as follows. Since the microphone array is composed of a plurality of microphones, a sound emitted from a given position is received by the different microphones with a relative delay. Therefore, the first relative reception delay between the microphones with respect to the noise recording is first calculated by a cross-correlation method. Specifically, the calculation formula is:
R_{12}(τ) = Σ_n x_1(n) · x_2(n + τ)
where x_1(n) and x_2(n) are the time series of the signals received by two microphones, τ is the displacement time, and R_{12}(τ) is the cross-correlation sequence. At the maximum of the cross-correlation sequence the two time series are aligned, and the index of the maximum multiplied by the time step of τ is the delay found by the cross-correlation.
In addition, a target plane can be selected and determined based on the orientation of the noise recording, and the target plane can be divided into regions (for example, a visible plane of 10 m by 10 m ahead, divided into regions of 1 m by 1 m), so that the second relative reception delays between the microphones for a sound emitted from each region can be calculated in turn according to the cross-correlation method. Finally, the first relative reception delay is compared with each second relative reception delay; the smaller the difference between them, the closer the corresponding positions, so the target region position corresponding to the target second relative reception delay with the smallest difference is determined as the target noise position.
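A minimal sketch of this delay-matching localization is shown below for a single microphone pair: the first relative reception delay is measured from the cross-correlation peak, a second relative reception delay is simulated for the centre of each 1 m by 1 m region of an assumed target plane, and the region whose simulated delay differs least is returned. The array geometry, sampling rate and grid layout are illustrative assumptions; a real array would combine the delays of many microphone pairs.

```python
# Sketch: cross-correlation delay estimation plus grid search over candidate region positions.
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s
FS = 16000              # sampling rate (Hz), assumed

def measured_delay(x1, x2):
    """First relative reception delay (seconds) from the peak of the cross-correlation."""
    corr = np.correlate(x1, x2, mode="full")      # R_12(tau) for tau = -(N-1)..(N-1)
    lag = np.argmax(corr) - (len(x2) - 1)         # index of the maximum -> lag in samples
    return lag / FS

def simulated_delay(region_center, mic1, mic2):
    """Second relative reception delay for a source assumed at the region centre."""
    return (np.linalg.norm(region_center - mic1)
            - np.linalg.norm(region_center - mic2)) / SPEED_OF_SOUND

def locate(noise_ch1, noise_ch2, mic1, mic2, grid_points):
    """Target region = the grid point whose simulated delay is closest to the measured one."""
    tau_measured = measured_delay(noise_ch1, noise_ch2)
    diffs = [abs(simulated_delay(p, mic1, mic2) - tau_measured) for p in grid_points]
    return grid_points[int(np.argmin(diffs))]

# Usage: a 10 m x 10 m visible plane 5 m ahead of the array, split into 1 m x 1 m regions.
mic1, mic2 = np.array([-0.1, 0.0, 0.0]), np.array([0.1, 0.0, 0.0])
grid_points = np.array([[x + 0.5 - 5.0, y + 0.5, 5.0] for x in range(10) for y in range(10)])
ch1, ch2 = np.random.randn(FS), np.random.randn(FS)   # stand-ins for two noise-recording channels
print(locate(ch1, ch2, mic1, mic2, grid_points))
```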
In one embodiment, the method further comprises:
and acquiring an environmental image collected by a monitoring dome camera in the target environment, and generating noise-source acoustic image evidence information by combining the environmental image with the target noise position.
In the embodiment of the application, in addition to the microphone array, a monitoring dome camera is arranged in the target environment to collect environmental image information. The determined target noise position is combined with the environmental image and plotted, so that the specific building from which the noise source emanates can be determined, and noise-source acoustic image evidence information is generated and obtained for subsequent follow-up and control.
The beamforming-based adaptive positioning apparatus for environmental noise sources provided by the embodiment of the present application will be described in detail below with reference to fig. 2. It should be noted that the apparatus shown in fig. 2 is used to execute the method of the embodiment shown in fig. 1 of the present application; for ease of description, only the parts related to the embodiment of the present application are shown, and for specific technical details that are not disclosed here, please refer to the embodiment shown in fig. 1 of the present application.
Referring to fig. 2, fig. 2 is a schematic structural diagram of an adaptive positioning apparatus for environmental noise source based on beamforming according to an embodiment of the present application. As shown in fig. 2, the apparatus includes:
an acquisition module 201, configured to acquire an environmental recording collected by a microphone array in a target environment, and to extract voiceprint Mel-spectrum features of the environmental recording based on the Mel cepstrum;
a selection module 202, configured to determine a noise source type corresponding to each voiceprint Mel-spectrum feature based on a KNN algorithm, and to select all noise features corresponding to the noise source type from a preset noise database;
and a positioning module 203, configured to perform beamforming positioning on the environmental recording based on each of the noise features to obtain a target noise position.
In one implementation, the acquisition module 201 includes:
a first calculation unit, configured to calculate a fast Fourier transform spectrum corresponding to the environmental recording;
and a second calculation unit, configured to pass the fast Fourier transform spectrum through a Mel-frequency filter to obtain the voiceprint Mel-spectrum features.
In one possible implementation, the selection module 202 includes:
a first acquisition unit, configured to obtain a classified-sample voiceprint feature plane corresponding to the preset noise database, and to map each voiceprint Mel-spectrum feature onto the classified-sample voiceprint feature plane;
a third calculation unit, configured to calculate the distances between the mapped voiceprint Mel-spectrum feature and the sample voiceprint features in the classified-sample voiceprint feature plane, and to select a first preset number of first sample voiceprint features in order of increasing distance;
and a counting unit, configured to count the number of samples of each noise source type among the first sample voiceprint features, and to determine the noise source type with the largest count as the noise source type corresponding to the voiceprint Mel-spectrum feature.
In one possible implementation, the positioning module 203 includes:
a screening unit, configured to filter the environmental recording based on each of the noise features and to remove the sound features in the environmental recording that do not match any of the noise features, to obtain a noise recording;
and a positioning unit, configured to perform beamforming positioning on the noise recording to obtain a target noise position.
In one embodiment, the positioning unit comprises:
a calculation element, configured to calculate, based on a cross-correlation method, a first relative reception delay of the noise recording between the microphones of the microphone array;
a first determining element, configured to determine a target plane corresponding to the noise recording and to divide the target plane into a second preset number of region positions;
a simulation element, configured to simulate, for each region position, a second relative reception delay between the microphones;
and a second determining element, configured to determine the target second relative reception delay with the smallest difference from the first relative reception delay, where the target region position corresponding to the target second relative reception delay is the target noise position.
In one embodiment, the apparatus further comprises:
and a combination module, configured to acquire an environmental image collected by a monitoring dome camera in the target environment, and to generate noise-source acoustic image evidence information by combining the environmental image with the target noise position.
It is clear to a person skilled in the art that the solution according to the embodiments of the present application can be implemented by means of software and/or hardware. The "unit" and "module" in this specification refer to software and/or hardware that can perform a specific function independently or in cooperation with other components, where the hardware may be, for example, a Field-Programmable Gate Array (FPGA), an Integrated Circuit (IC), or the like.
Each processing unit and/or module in the embodiments of the present application may be implemented by an analog circuit that implements the functions described in the embodiments of the present application, or may be implemented by software that executes the functions described in the embodiments of the present application.
Referring to fig. 3, a schematic structural diagram of an electronic device according to an embodiment of the present application is shown, where the electronic device may be used to implement the method in the embodiment shown in fig. 1. As shown in fig. 3, the electronic device 300 may include: at least one central processor 301, at least one network interface 304, a user interface 303, a memory 305, at least one communication bus 302.
Wherein a communication bus 302 is used to enable the connection communication between these components.
The user interface 303 may include a Display screen (Display) and a Camera (Camera), and the optional user interface 303 may further include a standard wired interface and a wireless interface.
The network interface 304 may optionally include a standard wired interface, a wireless interface (e.g., WI-FI interface), among others.
The central processor 301 may include one or more processing cores. Using various interfaces and lines, the central processor 301 connects the various parts of the entire electronic device 300, and performs the various functions of the electronic device 300 and processes data by running or executing instructions, programs, code sets or instruction sets stored in the memory 305 and by calling data stored in the memory 305. Optionally, the central processor 301 may be implemented in at least one hardware form among Digital Signal Processing (DSP), Field-Programmable Gate Array (FPGA) and Programmable Logic Array (PLA). The central processor 301 may integrate one or a combination of a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), a modem, and the like. The CPU mainly handles the operating system, the user interface, application programs and the like; the GPU renders and draws the content to be displayed on the display screen; and the modem handles wireless communication. It is understood that the modem may also not be integrated into the central processor 301 and may instead be implemented by a separate chip.
The memory 305 may include a Random Access Memory (RAM) or a Read-Only Memory (ROM). Optionally, the memory 305 includes a non-transitory computer-readable medium. The memory 305 may be used to store instructions, programs, code, code sets or instruction sets. The memory 305 may include a program storage area and a data storage area, where the program storage area may store instructions for implementing an operating system, instructions for at least one function (such as a touch function, a sound playing function or an image playing function), instructions for implementing the above method embodiments, and the like; the data storage area may store the data involved in the above method embodiments. Optionally, the memory 305 may also be at least one storage device located remotely from the central processor 301. As shown in fig. 3, the memory 305, as a computer storage medium, may include an operating system, a network communication module, a user interface module and program instructions.
In the electronic device 300 shown in fig. 3, the user interface 303 is mainly used to provide an input interface for the user and to obtain data input by the user, and the central processor 301 may be configured to invoke the beamforming-based environmental noise source adaptive positioning application stored in the memory 305 and specifically perform the following operations:
acquiring an environmental recording collected by a microphone array in a target environment, and extracting voiceprint Mel-spectrum features of the environmental recording based on the Mel cepstrum;
determining a noise source type corresponding to each voiceprint Mel-spectrum feature based on a KNN algorithm, and selecting all noise features corresponding to the noise source type from a preset noise database;
and performing beamforming positioning on the environmental recording based on each of the noise features to obtain a target noise position.
The present application also provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of the above-described method. The computer-readable storage medium may include, but is not limited to, any type of disk including floppy disks, optical disks, DVD, CD-ROMs, microdrive, and magneto-optical disks, ROMs, RAMs, EPROMs, EEPROMs, DRAMs, VRAMs, flash memory devices, magnetic or optical cards, nanosystems (including molecular memory ICs), or any type of media or device suitable for storing instructions and/or data.
It should be noted that, for simplicity of description, the above-mentioned method embodiments are described as a series of acts or combination of acts, but those skilled in the art will recognize that the present application is not limited by the order of acts described, as some steps may occur in other orders or concurrently depending on the application. Further, those skilled in the art should also appreciate that the embodiments described in the specification are preferred embodiments and that the acts and modules referred to are not necessarily required in this application.
In the foregoing embodiments, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus may be implemented in other manners. For example, the above-described embodiments of the apparatus are merely illustrative, and for example, the division of the units is only one type of division of logical functions, and there may be other divisions when actually implementing, for example, a plurality of units or components may be combined or may be integrated into another system, or some features may be omitted, or not implemented. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection of some service interfaces, devices or units, and may be an electrical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer readable memory. Based on such understanding, the technical solution of the present application may be substantially implemented or a part of or all or part of the technical solution contributing to the prior art may be embodied in the form of a software product stored in a memory, and including several instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method described in the embodiments of the present application. And the aforementioned memory comprises: various media capable of storing program codes, such as a usb disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a removable hard disk, a magnetic disk, or an optical disk.
Those skilled in the art will appreciate that all or part of the steps in the methods of the above embodiments may be implemented by a program, which is stored in a computer-readable memory, and the memory may include: flash disks, Read-Only memories (ROMs), Random Access Memories (RAMs), magnetic or optical disks, and the like.
The above description is only an exemplary embodiment of the present disclosure, and the scope of the present disclosure should not be limited thereby. That is, all equivalent changes and modifications made in accordance with the teachings of the present disclosure are intended to be included within the scope of the present disclosure. Embodiments of the present disclosure will be readily apparent to those skilled in the art from consideration of the specification and practice of the disclosure herein. This application is intended to cover any variations, uses, or adaptations of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known or customary practice within the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.

Claims (9)

1. A beamforming-based adaptive positioning method for environmental noise sources, the method comprising:
acquiring an environmental recording collected by a microphone array in a target environment, and extracting voiceprint Mel-spectrum features of the environmental recording based on the Mel cepstrum;
determining a noise source type corresponding to each voiceprint Mel-spectrum feature based on a KNN algorithm, and selecting all noise features corresponding to the noise source type from a preset noise database;
and performing beamforming positioning on the environmental recording based on each of the noise features to obtain a target noise position.
2. The method of claim 1, wherein the extracting of the voiceprint Mel-spectrum features of the environmental recording based on the Mel cepstrum comprises:
calculating a fast Fourier transform spectrum corresponding to the environmental recording;
and passing the fast Fourier transform spectrum through a Mel-frequency filter to obtain the voiceprint Mel-spectrum features.
3. The method according to claim 1, wherein the determining of the noise source type corresponding to each voiceprint Mel-spectrum feature based on the KNN algorithm comprises:
obtaining a classified-sample voiceprint feature plane corresponding to the preset noise database, and mapping each voiceprint Mel-spectrum feature onto the classified-sample voiceprint feature plane;
calculating the distances between the mapped voiceprint Mel-spectrum feature and the sample voiceprint features in the classified-sample voiceprint feature plane, and selecting a first preset number of first sample voiceprint features in order of increasing distance;
and counting the number of samples of each noise source type among the first sample voiceprint features, and determining the noise source type with the largest count as the noise source type corresponding to the voiceprint Mel-spectrum feature.
4. The method of claim 1, wherein the performing beamforming positioning on the environmental recording based on each of the noise features to obtain a target noise position comprises:
filtering the environmental recording based on each of the noise features, and removing sound features in the environmental recording that do not match any of the noise features, to obtain a noise recording;
and performing beamforming positioning on the noise recording to obtain the target noise position.
5. The method of claim 4, wherein the performing beamforming positioning on the noise recording to obtain a target noise position comprises:
calculating, based on a cross-correlation method, a first relative reception delay of the noise recording between the microphones of the microphone array;
determining a target plane corresponding to the noise recording, and dividing the target plane into a second preset number of region positions;
simulating, for each region position, a second relative reception delay between the microphones;
and determining the target second relative reception delay whose difference from the first relative reception delay is smallest, the target region position corresponding to the target second relative reception delay being the target noise position.
6. The method of claim 1, further comprising:
acquiring an environmental image collected by a monitoring dome camera in the target environment, and generating noise-source acoustic image evidence information by combining the environmental image with the target noise position.
7. A beamforming-based adaptive positioning apparatus for environmental noise sources, the apparatus comprising:
an acquisition module, configured to acquire an environmental recording collected by a microphone array in a target environment, and to extract voiceprint Mel-spectrum features of the environmental recording based on the Mel cepstrum;
a selection module, configured to determine a noise source type corresponding to each voiceprint Mel-spectrum feature based on a KNN algorithm, and to select all noise features corresponding to the noise source type from a preset noise database;
and a positioning module, configured to perform beamforming positioning on the environmental recording based on each of the noise features to obtain a target noise position.
8. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the steps of the method according to any of claims 1-6 are implemented when the computer program is executed by the processor.
9. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the steps of the method according to any one of claims 1 to 6.
CN202210778085.0A 2022-07-04 2022-07-04 Self-adaptive positioning method and device for environmental noise source based on beam forming Active CN114863943B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210778085.0A CN114863943B (en) 2022-07-04 2022-07-04 Self-adaptive positioning method and device for environmental noise source based on beam forming

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210778085.0A CN114863943B (en) 2022-07-04 2022-07-04 Self-adaptive positioning method and device for environmental noise source based on beam forming

Publications (2)

Publication Number Publication Date
CN114863943A (en) 2022-08-05
CN114863943B (en) 2022-11-04

Family

ID=82625942

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210778085.0A Active CN114863943B (en) 2022-07-04 2022-07-04 Self-adaptive positioning method and device for environmental noise source based on beam forming

Country Status (1)

Country Link
CN (1) CN114863943B (en)


Patent Citations (24)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5581620A (en) * 1994-04-21 1996-12-03 Brown University Research Foundation Methods and apparatus for adaptive beamforming
US20100150364A1 (en) * 2008-12-12 2010-06-17 Nuance Communications, Inc. Method for Determining a Time Delay for Time Delay Compensation
US20120076316A1 (en) * 2010-09-24 2012-03-29 Manli Zhu Microphone Array System
US20120140947A1 (en) * 2010-12-01 2012-06-07 Samsung Electronics Co., Ltd Apparatus and method to localize multiple sound sources
US20130039503A1 (en) * 2011-08-11 2013-02-14 Broadcom Corporation Beamforming apparatus and method based on long-term properties of sources of undesired noise affecting voice quality
US20130147835A1 (en) * 2011-12-09 2013-06-13 Hyundai Motor Company Technique for localizing sound source
US20170004818A1 (en) * 2015-07-01 2017-01-05 zPillow, Inc. Noise cancelation system and techniques
US20180033447A1 (en) * 2016-08-01 2018-02-01 Apple Inc. Coordination of beamformers for noise estimation and noise suppression
CN206573209U (en) * 2017-01-25 2017-10-20 大连理工大学 A kind of Noise Sources Identification system based on Phase conjugation theory
CN107547981A (en) * 2017-05-17 2018-01-05 宁波桑德纳电子科技有限公司 A kind of audio collecting device, supervising device and collection sound method
CN108538320A (en) * 2018-03-30 2018-09-14 广东欧珀移动通信有限公司 Recording control method and device, readable storage medium storing program for executing, terminal
CN108760034A (en) * 2018-05-21 2018-11-06 广西电网有限责任公司电力科学研究院 A kind of transformer vibration noise source positioning system and method
CN110691299A (en) * 2019-08-29 2020-01-14 科大讯飞(苏州)科技有限公司 Audio processing system, method, apparatus, device and storage medium
CN112463103A (en) * 2019-09-06 2021-03-09 北京声智科技有限公司 Sound pickup method, sound pickup device, electronic device and storage medium
CN110767226A (en) * 2019-10-30 2020-02-07 山西见声科技有限公司 Sound source positioning method and device with high accuracy, voice recognition method and system, storage equipment and terminal
CN111175698A (en) * 2020-01-18 2020-05-19 国网山东省电力公司菏泽供电公司 Transformer noise source positioning method, system and device based on sound and vibration combination
US20210400399A1 (en) * 2020-06-18 2021-12-23 Sivantos Pte. Ltd. Hearing aid system including at least one hearing aid instrument worn on a user's head and method for operating such a hearing aid system
CN111880148A (en) * 2020-08-07 2020-11-03 北京字节跳动网络技术有限公司 Sound source positioning method, device, equipment and storage medium
CN112098939A (en) * 2020-09-18 2020-12-18 广东电网有限责任公司电力科学研究院 Method and device for identifying and evaluating noise pollution source
CN112687294A (en) * 2020-12-21 2021-04-20 重庆科技学院 Vehicle-mounted noise identification method
CN112966560A (en) * 2021-02-03 2021-06-15 郑州大学 Electric spindle fault diagnosis method and device based on deconvolution imaging
CN113689873A (en) * 2021-09-07 2021-11-23 联想(北京)有限公司 Noise suppression method, device, electronic equipment and storage medium
CN114355290A (en) * 2022-03-22 2022-04-15 杭州兆华电子股份有限公司 Sound source three-dimensional imaging method and system based on stereo array
CN114509162A (en) * 2022-04-18 2022-05-17 四川三元环境治理股份有限公司 Sound environment data monitoring method and system

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Yang Yang et al.: "Noise source identification technology using cross-spectral matrix beamforming with auto-spectrum removal", Noise and Vibration Control *
Jiang Weikang et al.: "Research progress on the sound source characteristics of urban rail transit noise", Environmental Pollution & Control *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115547356A (en) * 2022-11-25 2022-12-30 杭州兆华电子股份有限公司 Wind noise processing method and system based on abnormal sound detection of unmanned aerial vehicle
CN115547356B (en) * 2022-11-25 2023-03-10 杭州兆华电子股份有限公司 Wind noise processing method and system based on abnormal sound detection of unmanned aerial vehicle

Also Published As

Publication number Publication date
CN114863943B (en) 2022-11-04

Similar Documents

Publication Publication Date Title
US11657798B2 (en) Methods and apparatus to segment audio and determine audio segment similarities
CN110880329B (en) Audio identification method and equipment and storage medium
CN110875060A (en) Voice signal processing method, device, system, equipment and storage medium
US20160187453A1 (en) Method and device for a mobile terminal to locate a sound source
CN111077496B (en) Voice processing method and device based on microphone array and terminal equipment
CN109637525B (en) Method and apparatus for generating an on-board acoustic model
WO2021000498A1 (en) Composite speech recognition method, device, equipment, and computer-readable storage medium
CN114863943B (en) Self-adaptive positioning method and device for environmental noise source based on beam forming
US20140278415A1 (en) Voice Recognition Configuration Selector and Method of Operation Therefor
CN109102819A (en) One kind is uttered long and high-pitched sounds detection method and device
AU2022275486A1 (en) Methods and apparatus to fingerprint an audio signal via normalization
CN111868823A (en) Sound source separation method, device and equipment
CN114333881B (en) Audio transmission noise reduction method, device and medium based on environment self-adaptation
CN111385688A (en) Active noise reduction method, device and system based on deep learning
CN113327628A (en) Audio processing method and device, readable medium and electronic equipment
US20220212108A1 (en) Audio frequency signal processing method and apparatus, terminal and storage medium
CN114882912B (en) Method and device for testing transient defects of time domain of acoustic signal
CN114420100B (en) Voice detection method and device, electronic equipment and storage medium
CN114944152A (en) Vehicle whistling sound identification method
CN114415115B (en) Target signal frequency automatic optimization method for assisting direction of arrival positioning
CN115910107A (en) Audio data detection method, computer and readable storage medium
CN117789755A (en) Audio data detection method and device and electronic equipment
CN116959470A (en) Audio extraction method, device, equipment and storage medium
CN116129915A (en) Identity recognition method, voice quality inspection method and related equipment
CN116953604A (en) Sound source direction estimation method, head-mounted device, and storage medium

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant