CN113311391A - Sound source positioning method, device and equipment based on microphone array and storage medium - Google Patents

Sound source positioning method, device and equipment based on microphone array and storage medium

Publication number
CN113311391A
Authority
CN
China
Prior art keywords
similarity
classification result
sound source
output signals
microphone array
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110452117.3A
Other languages
Chinese (zh)
Inventor
陈英博
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Pulian International Co ltd
Original Assignee
Pulian International Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Pulian International Co ltd
Priority to CN202110452117.3A
Publication of CN113311391A
Legal status: Pending

Classifications

    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01S RADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S5/00 Position-fixing by co-ordinating two or more direction or position line determinations; Position-fixing by co-ordinating two or more distance determinations
    • G01S5/18 Position-fixing by co-ordinating two or more direction or position line determinations; Position-fixing by co-ordinating two or more distance determinations using ultrasonic, sonic, or infrasonic waves
    • G01S5/22 Position of source determined by co-ordinating a plurality of position lines defined by path-difference measurements

Abstract

The invention relates to the technical field of sound source positioning, and discloses a sound source positioning method, a sound source positioning device, positioning equipment and a storage medium based on a microphone array, wherein the method comprises the following steps: acquiring output signals of a plurality of microphone arrays; calculating the similarity of any two output signals of different microphone arrays, and classifying all the output signals according to the similarity to obtain a classification result; and determining the number and the position information of the sound sources according to the classification result. According to the invention, the similarity of all output signals of the plurality of microphone arrays is calculated, and the output signals are classified according to the similarity, so that the position of a sound source is determined according to the classification result, and the problem that the sound source matching cannot be realized in the prior art is solved.

Description

Sound source positioning method, device and equipment based on microphone array and storage medium
Technical Field
The invention relates to the technical field of sound source positioning, in particular to a sound source positioning method, a sound source positioning device, sound source positioning equipment and a storage medium based on a microphone array.
Background
In the technical field of sound source positioning, a plurality of microphone arrays are generally used to locate a plurality of sound sources. During positioning, each microphone array produces output signals for several sound sources, so the system has to handle multiple sound sources and multiple output signals at once. The prior art, however, cannot perform sound source matching on these output signals: it cannot determine which outputs of the different microphone arrays correspond to the same sound source, and therefore cannot determine the position of that sound source.
Disclosure of Invention
The embodiment of the invention aims to provide a sound source positioning method, a sound source positioning device, positioning equipment and a storage medium based on a microphone array.
In order to achieve the above object, an embodiment of the present invention provides a sound source localization method based on a microphone array, including:
acquiring output signals of a plurality of microphone arrays;
calculating the similarity of any two output signals of different microphone arrays, and classifying all the output signals according to the similarity to obtain a classification result;
and determining the number and the position information of the sound sources according to the classification result.
Preferably, the calculating the similarity between any two output signals of different microphone arrays and classifying all the output signals according to the similarity to obtain a classification result specifically includes:
obtaining an initial classification result according to an output signal of any microphone array;
calculating the similarity between any output signal of other microphone arrays and all categories in the initial classification result, and acquiring the maximum similarity;
when the maximum similarity is larger than a preset threshold value, classifying any output signal into the category of the initial classification result corresponding to the maximum similarity;
and when the maximum similarity is smaller than a preset threshold value, updating the initial classification result according to any output signal.
Preferably, the similarity is calculated from a cross-correlation function.
Preferably, the calculating the similarity between any output signal of the other microphone array and all the categories in the initial classification result specifically includes:
according to the formula
[formula image BDA0003036909470000021 not reproduced in this text]
calculating a similarity r, where $S_i$ represents the ith frequency domain signal of the output signal corresponding to any one of the categories in the initial classification result, $\bar{S}$ represents the average of all frequency domain signals of that output signal, $T_j$ represents the jth frequency domain signal of any output signal of the other microphone arrays, and $\bar{T}$ represents the average of all frequency domain signals of that output signal.
Preferably, the determining the number and the position information of the sound sources according to the classification result specifically includes:
determining the number of sound sources according to the category number of the classification result;
and determining the position information of the corresponding sound source according to the output signal corresponding to each category.
Preferably, before the determining the number of sound sources according to the number of categories of the classification result, the method further includes:
and deleting the categories of which the number of the output signals in the classification result is less than the preset number.
Another embodiment of the present invention provides a sound source localization apparatus based on a microphone array, including:
the signal acquisition module is used for acquiring output signals of a plurality of microphone arrays;
the classification module is used for calculating the similarity of any two output signals and classifying all the output signals according to the similarity to obtain a classification result;
and the positioning module is used for determining the number and the position information of the sound sources according to the classification result.
Another embodiment of the present invention provides a microphone array based sound source localization apparatus, comprising a processor, a memory, and a computer program stored in the memory and configured to be executed by the processor, the processor implementing the microphone array based sound source localization method as described in any one of the above items when executing the computer program.
Another embodiment of the present invention provides a computer-readable storage medium comprising a stored computer program, wherein the computer program, when executed, controls an apparatus on which the computer-readable storage medium is located to perform any one of the above-mentioned microphone array based sound source localization methods.
Compared with the prior art, the sound source positioning method, the sound source positioning device, the sound source positioning equipment and the storage medium based on the microphone arrays provided by the embodiment of the invention calculate the similarity of all output signals of the microphone arrays and classify the output signals according to the similarity, so that which output signals correspond to the same sound source is determined, and the problem that the sound source matching cannot be realized in the prior art is solved. Meanwhile, after the output signal corresponding to each sound source is determined, the position information of the sound source can be determined according to the corresponding output signal, and the sound source can be accurately positioned.
Drawings
Fig. 1 is a schematic flowchart of a sound source localization method based on a microphone array according to an embodiment of the present invention;
Fig. 2 is a schematic diagram of microphone arrays performing sound source localization according to an embodiment of the present invention;
Fig. 3 is a schematic structural diagram of a sound source localization apparatus based on a microphone array according to an embodiment of the present invention;
Fig. 4 is a schematic diagram of a sound source localization apparatus based on a microphone array according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Referring to fig. 1, it is a schematic flowchart of a sound source localization method based on a microphone array according to the embodiment of the present invention, where the method includes steps S1 to S3:
S1, acquiring output signals of a plurality of microphone arrays;
S2, calculating the similarity of any two output signals of different microphone arrays, and classifying all the output signals according to the similarity to obtain a classification result;
and S3, determining the number and the position information of the sound sources according to the classification result.
It should be noted that one microphone array can monitor a plurality of sound sources in a space. When several microphone arrays are present in the space, the microphone array system as a whole outputs many signals, but these signals are not classified or matched, so it is not known which output signals correspond to the same sound source. For ease of understanding, the embodiment of the present invention provides a schematic diagram of microphone arrays performing sound source localization; see fig. 2. As can be seen from fig. 2, the first microphone array localizes three sound sources and therefore outputs three output signals, shown as the three dashed lines leaving O1 in fig. 2, each dashed line representing one sound source signal. Different arrays may localize different numbers of sound sources. For example, when O1 and O3 are far apart and the sound source P2 is far from O3, O3 cannot localize P2 and can only localize P1 and P3. The three microphone arrays in fig. 2 produce 8 output signals in total, but the prior art cannot distinguish which output signals correspond to the same sound source; the present invention aims to solve this technical problem.
Specifically, a plurality of microphone arrays are controlled to monitor a space containing multiple sound sources, and the output signals of the microphone arrays are then acquired. Typically, each microphone array outputs N output signals, one for each sound source it detects, and each output signal includes a pitch angle, an azimuth angle and an audio signal. If the number of sound sources in the space is W, then N ≤ W, because some sound sources may be too far from a given microphone array to be monitored, in which case that array outputs no corresponding signal.
Since the output signals of the same microphone array correspond to different sound sources, they are necessarily dissimilar, and their mutual similarity need not be calculated, which reduces the amount of calculation. The similarity is therefore calculated only for pairs of output signals from different microphone arrays, and all the output signals are classified according to the similarity to obtain a classification result. Note that the similarity is generally calculated from the audio signals contained in the output signals, since audio from the same sound source will be similar.
The number and position information of the sound sources are then determined according to the classification result. That is, each category corresponds to one sound source, and the position of that sound source can be determined from the pitch angles and azimuth angles of the output signals in the category.
The embodiment of the invention provides a sound source positioning method based on a microphone array, which classifies output signals according to the similarity by calculating the similarity of all the output signals of a plurality of microphone arrays, determines the position of a sound source according to a classification result and solves the problem that the sound source matching cannot be realized in the prior art.
As an improvement of the above scheme, the calculating a similarity between any two output signals of different microphone arrays, and classifying all the output signals according to the similarity to obtain a classification result specifically includes:
obtaining an initial classification result according to an output signal of any microphone array;
calculating the similarity between any output signal of other microphone arrays and all categories in the initial classification result, and acquiring the maximum similarity;
when the maximum similarity is larger than a preset threshold value, classifying any output signal into the category of the initial classification result corresponding to the maximum similarity;
and when the maximum similarity is smaller than a preset threshold value, updating the initial classification result according to any output signal.
Specifically, an initial classification result is obtained from the output signal of any microphone array. For example, if the first microphone array has K output signals, each output signal is taken as a class, and the initial classification result has K classes.
The similarity between any output signal of the other microphone arrays and every category in the initial classification result is then calculated, and the maximum similarity is obtained. Note that the similarity between an output signal and a category is obtained by calculating the similarity between that output signal and each output signal already in the category.
When the maximum similarity is larger than a preset threshold, the output signal under consideration and the corresponding category are regarded as coming from the same sound source, so the output signal is placed in the category of the initial classification result that corresponds to the maximum similarity.
When the maximum similarity is smaller than the preset threshold, the output signal under consideration does not match any existing category, and the initial classification result needs to be updated: the output signal is added to the initial classification result as a new category of its own, and the similarity to this new category is also calculated when subsequent output signals are processed.
To further the understanding of this embodiment of the present invention, an example is described below. Suppose the first microphone array has 3 output signals, and 3 sets are established in advance: C1 = {O(1,1)}, C2 = {O(1,2)}, C3 = {O(1,3)}. For the 1st output signal O(2,1) of the second microphone array, the similarity of O(2,1) to each element in each existing set is calculated; if the similarity of O(2,1) to every element in C1, C2 and C3 is smaller than the threshold T, a new set C4 = {O(2,1)} is created for O(2,1). For the 2nd output signal O(2,2) of the second microphone array, the similarity between O(2,2) and O(1,1) is found to be greater than the threshold T, so O(2,2) is added to the set C1 that contains O(1,1). At this point there are 4 sets: C1 = {O(1,1), O(2,2)}, C2 = {O(1,2)}, C3 = {O(1,3)}, C4 = {O(2,1)}. The output signals of the other microphone arrays are processed in the same way, which is not described again here.
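The grouping procedure in this example can be sketched in Python as follows. This is a minimal illustration rather than the patented implementation: the function name `classify_outputs`, the representation of signals as string labels, the similarity table and the threshold value 0.5 are all assumptions made for the sketch. Members that come from the same array as the signal being classified are skipped, in line with the remark that same-array similarities need not be calculated, and a maximum similarity exactly equal to the threshold (a case the text does not specify) is treated here as "no match".

```python
from typing import Callable, List, Sequence, Tuple

Member = Tuple[int, str]  # (array index, signal label), hypothetical representation


def classify_outputs(
    arrays: Sequence[Sequence[str]],
    similarity: Callable[[str, str], float],
    threshold: float,
) -> List[List[Member]]:
    """Greedy grouping of output signals; arrays[k] holds the outputs of array k."""
    # Every output of the first array seeds its own category.
    categories: List[List[Member]] = [[(0, s)] for s in arrays[0]]
    for k, outputs in enumerate(arrays[1:], start=1):
        for signal in outputs:
            best_idx, best_score = None, float("-inf")
            for idx, category in enumerate(categories):
                for other_k, member in category:
                    if other_k == k:
                        continue  # same-array outputs belong to different sources
                    score = similarity(signal, member)
                    if score > best_score:
                        best_idx, best_score = idx, score
            if best_idx is not None and best_score > threshold:
                categories[best_idx].append((k, signal))  # same sound source
            else:
                categories.append([(k, signal)])  # start a new category
    return categories


# Hypothetical similarity table reproducing the example in the text:
# only O(2,2) and O(1,1) are similar; every other pair scores 0.1.
table = {("O(2,2)", "O(1,1)"): 0.9}
result = classify_outputs(
    [["O(1,1)", "O(1,2)", "O(1,3)"], ["O(2,1)", "O(2,2)"]],
    lambda a, b: table.get((a, b), 0.1),
    threshold=0.5,
)
print(result)
```

With these assumed similarity values the function returns four categories, matching C1 to C4 in the example.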
As an improvement of the above scheme, the similarity is calculated from a cross-correlation function.
Specifically, the similarity is calculated according to a cross-correlation function, that is, the cross-correlation value of any output signal and each output signal in each category is calculated by using the cross-correlation function, and the maximum cross-correlation value is taken as the similarity between the two corresponding output signals.
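A minimal sketch of such a cross-correlation similarity is given below, assuming NumPy is available. The function name `xcorr_similarity` is hypothetical, and dividing by the product of the signal norms (so that the result cannot exceed 1) is an assumption of this sketch; the text only states that the maximum cross-correlation value is used.

```python
import numpy as np


def xcorr_similarity(a, b) -> float:
    """Maximum of the cross-correlation of a and b over all lags.

    The normalization by ||a|| * ||b|| is an assumption of this sketch;
    it bounds the result by 1 (Cauchy-Schwarz inequality).
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    if denom == 0.0:
        return 0.0  # at least one signal is silent
    corr = np.correlate(a, b, mode="full")  # one value per lag
    return float(np.max(corr) / denom)


# A pulse and a delayed copy of it correlate perfectly at the right lag.
pulse = [0, 0, 1, 2, 3, 0, 0, 0]
delayed = [0, 0, 0, 0, 1, 2, 3, 0]
alternating = [1, -1, 1, -1, 1, -1, 1, -1]
sim_delayed = xcorr_similarity(pulse, delayed)
sim_other = xcorr_similarity(pulse, alternating)
```

Taking the maximum over all lags makes the measure insensitive to the different propagation delays from one sound source to the different arrays.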
As an improvement of the above solution, the calculating the similarity between any output signal of the other microphone array and all the categories in the initial classification result specifically includes:
according to the formula
[formula image BDA0003036909470000061 not reproduced in this text]
calculating a similarity r, where $S_i$ represents the ith frequency domain signal of the output signal corresponding to any one of the categories in the initial classification result, $\bar{S}$ represents the average of all frequency domain signals of that output signal, $T_j$ represents the jth frequency domain signal of any output signal of the other microphone arrays, and $\bar{T}$ represents the average of all frequency domain signals of that output signal.
Specifically, the two output signals whose similarity is to be calculated are converted into the frequency domain by fast Fourier transform to obtain the corresponding frequency domain signals, and the similarity r of the two output signals is then calculated according to the formula
[formula image BDA0003036909470000071 not reproduced in this text]
where $S_i$ represents the ith frequency domain signal of the output signal corresponding to any one of the categories in the initial classification result, $1 \le i \le I/2$, and I is the audio length of that output signal, i.e. its length is I points; $\bar{S}$ represents the average of all frequency domain signals of that output signal; $T_j$ represents the jth frequency domain signal of any output signal of the other microphone arrays, $1 \le j \le J/2$, and J is the audio length of that output signal, i.e. its length is J points; $\bar{T}$ represents the average of all frequency domain signals of that output signal. The similarity satisfies $0 \le r \le 1$, and the larger r is, the more similar the two output signals are.
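Because the formula image is not reproduced in this text, the exact expression for r is not available here. The sketch below is therefore only an assumption that is consistent with the listed symbols (frequency domain samples, their averages) and with the stated range 0 ≤ r ≤ 1: the absolute value of a Pearson-style correlation coefficient between the magnitude spectra of the two signals. The function name `spectral_similarity`, the use of magnitudes, the absolute value and the equal-length requirement are all choices of this sketch, not statements from the patent.

```python
import numpy as np


def spectral_similarity(s_audio, t_audio) -> float:
    """Assumed form of r: |Pearson correlation| of the two magnitude spectra."""
    s_audio = np.asarray(s_audio, dtype=float)
    t_audio = np.asarray(t_audio, dtype=float)
    if s_audio.shape != t_audio.shape:
        raise ValueError("this sketch assumes equal-length signals (I == J)")
    half = s_audio.size // 2
    # Keep the first I/2 bins; the remaining bins mirror them for real input.
    S = np.abs(np.fft.fft(s_audio))[:half]
    T = np.abs(np.fft.fft(t_audio))[:half]
    ds, dt = S - S.mean(), T - T.mean()
    denom = np.sqrt(np.sum(ds**2) * np.sum(dt**2))
    if denom == 0.0:
        return 0.0  # a flat spectrum carries no shape information
    return float(abs(np.sum(ds * dt)) / denom)


n = np.arange(256)
tone = np.sin(2 * np.pi * 8 * n / 256)         # 8 cycles per frame
tone_shifted = np.roll(tone, 5)                # same tone, circularly delayed
other_tone = np.sin(2 * np.pi * 40 * n / 256)  # different frequency
r_same = spectral_similarity(tone, tone_shifted)
r_diff = spectral_similarity(tone, other_tone)
```

Under these assumptions a delayed copy of a tone scores close to 1, while a tone at a different frequency scores close to 0.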
As an improvement of the above scheme, the determining the number and the position information of the sound sources according to the classification result specifically includes:
determining the number of sound sources according to the category number of the classification result;
and determining the position information of the corresponding sound source according to the output signal corresponding to each category.
Specifically, the number of sound sources is determined according to the number of categories of the classification result. Generally, the number of sound sources is equal to the number of categories.
And determining the position information of the corresponding sound source according to the output signal corresponding to each category. Generally, position information of a corresponding sound source is determined according to a pitch angle and an azimuth angle in an output signal.
As an improvement of the above solution, before the determining the number of sound sources according to the number of categories of the classification result, the method further includes:
and deleting the categories of which the number of the output signals in the classification result is less than the preset number.
Specifically, the categories in which the number of output signals is less than the preset number are deleted. Optionally, the preset number is at least three. A category with only one output signal must be deleted, because triangulating a sound source requires at least two output signals. Triangulation with only two output signals may still leave a large positioning error, so to make the positioning more accurate, only categories with three or more output signals are retained.
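The deletion step amounts to a simple filter over the categories, sketched below. The function name `prune_categories` and the default of three are illustrative choices based on the optional value given above.

```python
from typing import List, Sequence


def prune_categories(categories: Sequence[Sequence[str]], min_outputs: int = 3) -> List[List[str]]:
    """Keep only categories that contain at least min_outputs output signals."""
    return [list(cat) for cat in categories if len(cat) >= min_outputs]


# Hypothetical classification result for three microphone arrays.
classified = [
    ["O(1,1)", "O(2,2)", "O(3,1)"],  # seen by all three arrays: kept
    ["O(1,2)", "O(3,2)"],            # only two outputs: deleted
    ["O(2,1)"],                      # a single output: deleted
]
kept = prune_categories(classified)
source_count = len(kept)  # number of sound sources = number of remaining categories
```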
After the categories that do not meet the requirement are deleted, the position information of the corresponding sound source can be determined from the remaining categories. A cost function is constructed from the output signals of any remaining category:
[formula image BDA0003036909470000081 not reproduced in this text]
and the cost function is solved to obtain the spatial coordinates of the corresponding sound source. The terms of the cost function are defined in formula images BDA0003036909470000082 to BDA0003036909470000084 (not reproduced in this text). In these definitions, each output signal points along a straight line; the mth output signal in the category consists of a pitch angle and an azimuth angle $\theta_m$, with $1 \le m \le M$, where M is the total number of output signals in the category; $P_m = H_m P$ and $P = (x, y, z)$, where $H_m$ is the spatial transformation matrix of the microphone array corresponding to the mth output signal relative to the world coordinate system, $(x, y, z)$ are the coordinates of the sound source point P in the world coordinate system, and $P_m$ are the coordinates of the sound source point P in the array coordinate system of the microphone array corresponding to the mth output signal; $d_m$ is the distance between the sound source point P and the straight line pointed to by the mth output signal; n is a preset norm. Optionally, when n is 2, the corresponding solving method is the least squares method; when n is 1, the corresponding solving method is the gradient descent method.
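For the n = 2 case, the least squares solution can be sketched as follows, assuming NumPy. Since the cost function image is not reproduced in this text, the sketch assumes the cost is the sum of squared distances from P to each line. It also assumes that each array's origin and pointing direction are already expressed in the world coordinate system (that is, the transformation by H_m has been applied beforehand) and that pitch is measured from the horizontal plane and azimuth from the x axis. The function names `direction` and `locate_source` are hypothetical.

```python
import numpy as np


def direction(pitch: float, azimuth: float) -> np.ndarray:
    """Unit vector for a pitch/azimuth pair (angle convention assumed, in radians)."""
    return np.array([
        np.cos(pitch) * np.cos(azimuth),
        np.cos(pitch) * np.sin(azimuth),
        np.sin(pitch),
    ])


def locate_source(origins, pitches, azimuths) -> np.ndarray:
    """Point minimizing the sum of squared distances to all lines (n = 2)."""
    A = np.zeros((3, 3))
    b = np.zeros(3)
    for origin, pitch, azimuth in zip(origins, pitches, azimuths):
        u = direction(pitch, azimuth)
        # I - u u^T projects onto the plane orthogonal to the line, so
        # |(I - u u^T)(P - origin)| is the distance from P to that line.
        proj = np.eye(3) - np.outer(u, u)
        A += proj
        b += proj @ np.asarray(origin, dtype=float)
    return np.linalg.solve(A, b)  # normal equations of the least squares problem


# Synthetic check: three arrays observe a source at (1, 2, 0.5) without noise.
source = np.array([1.0, 2.0, 0.5])
origins = [np.array([0.0, 0.0, 0.0]), np.array([4.0, 0.0, 0.0]), np.array([0.0, 5.0, 1.0])]
pitches, azimuths = [], []
for o in origins:
    v = source - o
    pitches.append(np.arcsin(v[2] / np.linalg.norm(v)))
    azimuths.append(np.arctan2(v[1], v[0]))
estimate = locate_source(origins, pitches, azimuths)
```

With noisy angles the lines no longer intersect in a single point, and the same solve returns the point closest to all of them in the squared-distance sense, which is why more than two output signals per category improve the estimate.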
Referring to fig. 3, it is a schematic structural diagram of a sound source localization apparatus based on a microphone array according to the embodiment of the present invention, where the apparatus includes:
a signal acquisition module 11, configured to acquire output signals of a plurality of microphone arrays;
the classification module 12 is configured to calculate similarity between any two output signals, and classify all the output signals according to the similarity to obtain a classification result;
and the positioning module 13 is configured to determine the number and the position information of the sound sources according to the classification result.
Preferably, the classification module 12 specifically includes:
the initial classification unit is used for obtaining an initial classification result according to an output signal of any microphone array;
the calculating unit is used for calculating the similarity between any output signal of other microphone arrays and all categories in the initial classification result and acquiring the maximum similarity;
the dividing unit is used for classifying any output signal into the category of the initial classification result corresponding to the maximum similarity when the maximum similarity is larger than a preset threshold;
and the updating unit is used for updating the initial classification result according to any output signal when the maximum similarity is smaller than a preset threshold value.
Preferably, the similarity is calculated from a cross-correlation function.
Preferably, the computing unit specifically includes:
a similarity calculation subunit, configured to calculate a similarity r according to the formula
[formula image BDA0003036909470000091 not reproduced in this text]
where $S_i$ represents the ith frequency domain signal of the output signal corresponding to any one of the categories in the initial classification result, $\bar{S}$ represents the average of all frequency domain signals of that output signal, $T_j$ represents the jth frequency domain signal of any output signal of the other microphone arrays, and $\bar{T}$ represents the average of all frequency domain signals of that output signal.
Preferably, the positioning module 13 specifically includes:
a sound source number determination unit for determining the number of sound sources according to the classification number of the classification result;
and the sound source positioning unit is used for determining the position information of the corresponding sound source according to the output signal corresponding to each category.
Preferably, the positioning module 13 further comprises:
and the deleting unit is used for deleting the categories of which the number of the output signals in the classification result is less than the preset number.
The sound source positioning device based on the microphone array provided by the embodiment of the invention can realize all the processes of the sound source positioning method based on the microphone array described in any one of the embodiments, and the functions and the realized technical effects of each module and unit in the device are respectively the same as the functions and the realized technical effects of the sound source positioning method based on the microphone array described in the embodiment, and are not repeated herein.
Referring to fig. 4, it is a schematic diagram of a microphone array based sound source positioning apparatus provided by the embodiment of the present invention, the positioning apparatus includes a processor 10, a memory 20, and a computer program stored in the memory 20 and configured to be executed by the processor 10, and when the processor 10 executes the computer program, the microphone array based sound source positioning method described in any of the above embodiments is implemented.
Illustratively, the computer program may be divided into one or more modules/units, which are stored in the memory 20 and executed by the processor 10 to implement the present invention. The one or more modules/units may be a series of computer program instruction segments capable of performing specific functions, the instruction segments describing the execution of the computer program in the microphone array based sound source positioning device. For example, the computer program may be divided into a signal acquisition module, a classification module and a positioning module, and each module has the following specific functions:
a signal acquisition module 11, configured to acquire output signals of a plurality of microphone arrays;
the classification module 12 is configured to calculate similarity between any two output signals, and classify all the output signals according to the similarity to obtain a classification result;
and the positioning module 13 is configured to determine the number and the position information of the sound sources according to the classification result.
The positioning device can be a desktop computer, a notebook, a palm computer, a cloud server or another computing device. The positioning device may include, but is not limited to, a processor and a memory. Those skilled in the art will appreciate that fig. 4 is merely an example of a positioning device and is not limiting; the positioning device may include more or fewer components than those shown, may combine some components, or may use different components. For example, the positioning device may also include input/output devices, network access devices, buses, etc.
The Processor 10 may be a Central Processing Unit (CPU), another general purpose Processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other Programmable logic device, a discrete Gate or transistor logic device, a discrete hardware component, etc. The general purpose processor may be a microprocessor or any conventional processor. The processor 10 is the control center of the positioning device and uses various interfaces and lines to connect the parts of the entire positioning device.
The memory 20 may be used to store the computer programs and/or modules, and the processor 10 implements the various functions of the positioning device by running or executing the computer programs and/or modules stored in the memory 20 and invoking data stored in the memory 20. The memory 20 may mainly include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required by at least one function (such as a sound playing function, an image playing function, etc.), and the like; the storage data area may store data (such as audio data, a phonebook, etc.) created according to the use of the cellular phone, and the like. In addition, the memory 20 may include high speed random access memory, and may also include non-volatile memory, such as a hard disk, a memory, a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card), at least one magnetic disk storage device, a Flash memory device, or other volatile solid state storage device.
The modules integrated in the positioning device can be stored in a computer-readable storage medium if they are implemented in the form of software functional units and sold or used as a stand-alone product. Based on such understanding, all or part of the flow of the method according to the embodiments of the present invention may also be implemented by a computer program, which may be stored in a computer-readable storage medium and which implements the steps of the method embodiments when executed by a processor. The computer program includes computer program code, and the computer program code may be in source code form, object code form, an executable file or some intermediate form. The computer-readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer Memory, a Read-Only Memory (ROM), a Random Access Memory (RAM), an electrical carrier signal, a telecommunications signal, a software distribution medium, and the like. It should be noted that the content contained in the computer-readable medium may be suitably increased or decreased as required by legislation and patent practice in a jurisdiction; for example, in some jurisdictions, in accordance with legislation and patent practice, the computer-readable medium does not include electrical carrier signals and telecommunications signals.
An embodiment of the present invention further provides a computer-readable storage medium that includes a stored computer program; when the computer program runs, the apparatus on which the computer-readable storage medium is located is controlled to perform the sound source positioning method based on a microphone array according to any of the above embodiments.
In summary, the sound source positioning method, device, positioning apparatus and storage medium based on a microphone array provided in the embodiments of the present invention calculate the similarity of all output signals of a plurality of microphone arrays, and classify the output signals according to the similarity, thereby determining which output signals correspond to the same sound source, and solving the problem that the sound source matching cannot be implemented in the prior art. Meanwhile, after the output signal corresponding to each sound source is determined, the position information of the sound source can be determined according to the corresponding output signal, and the sound source can be accurately positioned.
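The similarity-based classification summarized above can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes that each output signal is available as a one-dimensional array of frequency-domain values of equal length, that the similarity is a Pearson-style correlation coefficient (the formula image of the original is not reproduced in this text, so this choice is an assumption), that each category is represented by its first member, and that the preset threshold is 0.8. The names `similarity`, `classify_outputs` and `threshold` are hypothetical.

```python
import numpy as np

def similarity(s, t):
    """Pearson-style correlation of two equal-length spectra (assumed measure)."""
    s = np.asarray(s, dtype=float)
    t = np.asarray(t, dtype=float)
    ds, dt = s - s.mean(), t - t.mean()
    denom = np.sqrt((ds ** 2).sum() * (dt ** 2).sum())
    return 0.0 if denom == 0 else float((ds * dt).sum() / denom)

def classify_outputs(arrays, threshold=0.8):
    """arrays: list of microphone arrays, each a list of output spectra.

    Returns a list of categories; each category is a list of
    (array_index, signal_index) pairs assumed to share one sound source.
    """
    # Initial classification: each output signal of the first array
    # starts its own category.
    categories = [[(0, k)] for k in range(len(arrays[0]))]
    for a in range(1, len(arrays)):
        for k, sig in enumerate(arrays[a]):
            # Compare against the representative (first member) of each category.
            sims = [similarity(arrays[c[0][0]][c[0][1]], sig) for c in categories]
            best = int(np.argmax(sims)) if sims else -1
            if sims and sims[best] > threshold:
                # Maximum similarity exceeds the threshold: join that category.
                categories[best].append((a, k))
            else:
                # Otherwise update the classification with a new category.
                # (The equal-to-threshold case is not specified in the text;
                # it is treated as "no match" here.)
                categories.append([(a, k)])
    return categories
```

For example, with two arrays that each output two beams, beams with matching spectra end up in the same category, and the number of categories is an estimate of the number of sound sources.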
While the foregoing is directed to the preferred embodiment of the present invention, it will be understood by those skilled in the art that various changes and modifications may be made without departing from the spirit and scope of the invention.

Claims (9)

1. A sound source positioning method based on a microphone array is characterized by comprising the following steps:
acquiring output signals of a plurality of microphone arrays;
calculating the similarity of any two output signals of different microphone arrays, and classifying all the output signals according to the similarity to obtain a classification result;
and determining the number and the position information of the sound sources according to the classification result.
2. The method for positioning a sound source based on a microphone array according to claim 1, wherein the calculating a similarity between any two output signals of different microphone arrays and classifying all the output signals according to the similarity to obtain a classification result comprises:
obtaining an initial classification result according to an output signal of any microphone array;
calculating the similarity between any output signal of other microphone arrays and all categories in the initial classification result, and acquiring the maximum similarity;
when the maximum similarity is larger than a preset threshold value, classifying any output signal into the category of the initial classification result corresponding to the maximum similarity;
and when the maximum similarity is smaller than a preset threshold value, updating the initial classification result according to any output signal.
3. The microphone array-based sound source localization method of claim 1, wherein the similarity is calculated according to a cross-correlation function.
4. The method as claimed in claim 2, wherein the calculating the similarity between any output signal of other microphone arrays and all classes in the initial classification result comprises:
according to the formula
[formula image FDA0003036909460000021, not reproduced in this text]
calculating a similarity r, wherein S_i represents the i-th frequency domain signal of the output signal corresponding to any one of the classes in the initial classification result,
the symbol shown in image FDA0003036909460000022 represents the average of all frequency domain signals of the output signal corresponding to that class,
T_j represents the j-th frequency domain signal of any output signal of the other microphone arrays, and
the symbol shown in image FDA0003036909460000023 represents the average of all frequency domain signals of that output signal of the other microphone arrays.
5. The sound source localization method based on a microphone array according to claim 1, wherein the determining the number and location information of the sound sources according to the classification result specifically comprises:
determining the number of sound sources according to the category number of the classification result;
and determining the position information of the corresponding sound source according to the output signal corresponding to each category.
6. The microphone array-based sound source localization method of claim 5, further comprising, before the determining the number of sound sources according to the number of categories of the classification result:
and deleting the categories of which the number of the output signals in the classification result is less than the preset number.
7. A sound source localization apparatus based on a microphone array, comprising:
the signal acquisition module is used for acquiring output signals of a plurality of microphone arrays;
the classification module is used for calculating the similarity of any two output signals and classifying all the output signals according to the similarity to obtain a classification result;
and the positioning module is used for determining the number and the position information of the sound sources according to the classification result.
8. A microphone array based sound source localization device comprising a processor, a memory and a computer program stored in the memory and configured to be executed by the processor, the processor when executing the computer program implementing a microphone array based sound source localization method according to any of claims 1 to 6.
9. A computer-readable storage medium, comprising a stored computer program, wherein the computer program, when executed, controls an apparatus on which the computer-readable storage medium is located to perform a microphone array based sound source localization method according to any one of claims 1 to 6.
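As a hedged illustration of claims 3, 5 and 6 (a sketch under stated assumptions, not the claimed implementation), the Python below shows one possible cross-correlation-based similarity and the pruning and counting of categories. The function names `xcorr_similarity` and `count_sources`, the use of the normalized cross-correlation peak over all lags, and the default `min_signals=2` are assumptions introduced for illustration; categories are assumed to be lists of (array index, signal index) pairs.

```python
import numpy as np

def xcorr_similarity(x, y):
    """Peak magnitude of the normalized cross-correlation over all lags.

    Assumed similarity measure; the result lies in [0, 1].
    """
    x = np.asarray(x, dtype=float) - np.mean(x)
    y = np.asarray(y, dtype=float) - np.mean(y)
    norm = np.sqrt(np.sum(x ** 2) * np.sum(y ** 2))
    if norm == 0:
        return 0.0  # a constant signal carries no correlation information
    return float(np.max(np.abs(np.correlate(x, y, mode="full"))) / norm)

def count_sources(categories, min_signals=2):
    """Delete categories with fewer than min_signals output signals,
    then take the number of remaining categories as the number of sources."""
    kept = [c for c in categories if len(c) >= min_signals]
    return len(kept), kept
```

Dropping sparsely populated categories before counting is one way to keep a spurious output signal from a single array from being reported as a separate sound source.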
CN202110452117.3A 2021-04-25 2021-04-25 Sound source positioning method, device and equipment based on microphone array and storage medium Pending CN113311391A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110452117.3A CN113311391A (en) 2021-04-25 2021-04-25 Sound source positioning method, device and equipment based on microphone array and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110452117.3A CN113311391A (en) 2021-04-25 2021-04-25 Sound source positioning method, device and equipment based on microphone array and storage medium

Publications (1)

Publication Number Publication Date
CN113311391A true CN113311391A (en) 2021-08-27

Family

ID=77371150

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110452117.3A Pending CN113311391A (en) 2021-04-25 2021-04-25 Sound source positioning method, device and equipment based on microphone array and storage medium

Country Status (1)

Country Link
CN (1) CN113311391A (en)

Citations (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100303254A1 (en) * 2007-10-01 2010-12-02 Shinichi Yoshizawa Audio source direction detecting device
US9560446B1 (en) * 2012-06-27 2017-01-31 Amazon Technologies, Inc. Sound source locator with distributed microphone array
CN106782563A (en) * 2016-12-28 2017-05-31 上海百芝龙网络科技有限公司 A kind of intelligent home voice interactive system
JP2017097101A (en) * 2015-11-20 2017-06-01 富士通株式会社 Noise rejection device, noise rejection program, and noise rejection method
US20170263126A1 (en) * 2016-03-10 2017-09-14 Hyundai Motor Company Method for Providing Sound Detection Information, Apparatus Detecting Sound Around A Vehicle, and A Vehicle Including the Same
CN109239665A (en) * 2018-07-10 2019-01-18 北京大学深圳研究生院 A kind of more sound source consecutive tracking method and apparatus based on signal subspace similarity spectrum and particle filter
CN110148422A (en) * 2019-06-11 2019-08-20 南京地平线集成电路有限公司 The method, apparatus and electronic equipment of sound source information are determined based on microphone array
CN110782911A (en) * 2018-07-30 2020-02-11 阿里巴巴集团控股有限公司 Audio signal processing method, apparatus, device and storage medium
WO2020066542A1 (en) * 2018-09-26 2020-04-02 パナソニック インテレクチュアル プロパティ コーポレーション オブ アメリカ Acoustic object extraction device and acoustic object extraction method
CN111880148A (en) * 2020-08-07 2020-11-03 北京字节跳动网络技术有限公司 Sound source positioning method, device, equipment and storage medium
CN111919252A (en) * 2018-03-29 2020-11-10 索尼公司 Sound source direction estimation device, sound source direction estimation method, and program
CN112034424A (en) * 2020-08-26 2020-12-04 深圳信息职业技术学院 Neural network sound source direction finding method and system based on double microphones
CN112540347A (en) * 2020-11-17 2021-03-23 普联国际有限公司 Method and device for judging distance of sound source, terminal equipment and storage medium
CN112581978A (en) * 2020-12-11 2021-03-30 平安科技(深圳)有限公司 Sound event detection and positioning method, device, equipment and readable storage medium
CN112700788A (en) * 2020-12-23 2021-04-23 普联国际有限公司 Echo path modeling method, device, equipment and storage medium in echo cancellation

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
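王硕朋; 杨鹏; 孙昊 [Wang Shuopeng; Yang Peng; Sun Hao]: "基于声音位置指纹的室内声源定位方法" [Indoor sound source localization method based on sound position fingerprints], 北京工业大学学报 [Journal of Beijing University of Technology], no. 02, 10 February 2017 (2017-02-10) *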

Similar Documents

Publication Publication Date Title
CN109376596B (en) Face matching method, device, equipment and storage medium
CN108960090B (en) Video image processing method and device, computer readable medium and electronic equipment
CN109656366B (en) Emotional state identification method and device, computer equipment and storage medium
CN112488297B (en) Neural network pruning method, model generation method and device
CN111312295B (en) Holographic sound recording method and device and recording equipment
CN112990318B (en) Continuous learning method, device, terminal and storage medium
CN113892113A (en) Human body posture estimation method and device
CN114491399A (en) Data processing method and device, terminal equipment and computer readable storage medium
CN111311593A (en) Multi-ellipse detection and evaluation algorithm, device, terminal and readable storage medium based on image gradient information
CN113311390A (en) Sound source positioning method, device, equipment and storage medium based on distributed microphone array
CN113311391A (en) Sound source positioning method, device and equipment based on microphone array and storage medium
CN110895329B (en) Hybrid distribution model clutter map target detection method and device
US20220207892A1 (en) Method and device for classifing densities of cells, electronic device using method, and storage medium
DE102022120731A1 (en) MULTIMODAL SENSOR FUSION FOR CONTENT IDENTIFICATION IN HUMAN-MACHINE INTERFACE APPLICATIONS
CN111767710B (en) Indonesia emotion classification method, device, equipment and medium
US20210073580A1 (en) Method and apparatus for obtaining product training images, and non-transitory computer-readable storage medium
CN114463512A (en) Point cloud data processing method, vectorization method and device
CN113791386A (en) Method, device and equipment for positioning sound source and computer readable storage medium
CN109614854B (en) Video data processing method and device, computer device and readable storage medium
CN113868939A (en) Wind power probability density evaluation method, device, equipment and medium
CN112816959A (en) Clustering method, device, equipment and storage medium for vehicles
CN113312971A (en) Parameter calibration method and device for microphone array, terminal equipment and storage medium
CN110852767A (en) Passenger flow volume clustering method and terminal equipment
CN113160942A (en) Image data quality evaluation method and device, terminal equipment and readable storage medium
CN113361511A (en) Method, device and equipment for establishing correction model and computer readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination