CN113419216A - Multi-sound-source positioning method suitable for reverberation environment - Google Patents


Info

Publication number: CN113419216A (application CN202110684270.9A; granted as CN113419216B)
Authority: CN (China)
Inventors: Hu Qiucen (胡秋岑), Wu Lifu (吴礼福)
Applicant and current assignee: Nanjing University of Information Science and Technology
Legal status: Granted, Active (the legal status listed by Google Patents is an assumption, not a legal conclusion)

Classifications

    • G01S 5/18: Position-fixing by co-ordinating two or more direction or position line determinations, or two or more distance determinations, using ultrasonic, sonic, or infrasonic waves
    • G01S 5/20: Position of source determined by a plurality of spaced direction-finders
    • G06F 18/23213: Non-hierarchical clustering techniques using statistics or function optimisation, with a fixed number of clusters, e.g. K-means clustering
    • Y02D 30/70: Reducing energy consumption in wireless communication networks

Abstract

The invention provides a multi-sound-source positioning method suitable for reverberant environments. The method groups all coordinates of the whole search area and computes the center coordinate of each group; collects speech signals with a 16-microphone array; locates one sound source with the double-layer search space clustering (TL-SSC) multi-sound-source positioning algorithm and removes the coordinates near that source from the search area; and repeats this operation until all sound sources are located. The method solves real-time multi-sound-source positioning under reverberant conditions, requires fewer microphones than other multi-sound-source positioning methods, improves computational efficiency while keeping high positioning accuracy, and meets the real-time requirements of mobile-robot applications.

Description

Multi-sound-source positioning method suitable for reverberation environment
Technical Field
The invention relates to the technical field of sound source positioning, in particular to a multi-sound-source positioning method suitable for a reverberation environment.
Background
Multi-sound-source positioning is widely needed in real-time systems such as video conferencing, speech recognition, and mobile service robots, and has long been a research hotspot in acoustic signal processing. For example, when a mobile robot performs real-time intelligent services, a multi-sound-source positioning method determines the positions of speakers and guides the robot to complete the service. Existing sound-source positioning methods fall into three main categories: subspace-based methods, methods based on steerable beamforming, and methods based on time delay of arrival. Subspace-based methods receive signals at each microphone and exploit the orthogonality of the signal and noise subspaces to construct a spatial spectrum whose peaks give the source directions; they achieve high positioning accuracy but demand stationary source signals and perform poorly in small spaces. Steerable-beamforming methods are simple in principle and computationally light, but their noise robustness is poor, they require prior knowledge of the ambient noise, and real-time positioning is hard to guarantee. Time-delay-of-arrival methods determine the source position from the acoustic path differences between the source and each microphone; their computational complexity is generally lower than that of the other two categories, their positioning accuracy is high, and real-time operation is easy to satisfy.
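As a concrete illustration of the time-delay-of-arrival approach described above, the delay between two microphone signals is commonly estimated with the generalized cross-correlation with phase transform (GCC-PHAT). The sketch below is a minimal illustrative implementation, not code from the patent; the signal length and sampling rate are arbitrary example values.

```python
import numpy as np

def gcc_phat(x, y, fs):
    """Estimate the delay of x relative to y via generalized
    cross-correlation with PHAT weighting."""
    n = len(x) + len(y)
    X = np.fft.rfft(x, n=n)
    Y = np.fft.rfft(y, n=n)
    cross = X * np.conj(Y)
    cross /= np.abs(cross) + 1e-12       # PHAT: keep the phase, drop the magnitude
    cc = np.fft.irfft(cross, n=n)
    max_shift = n // 2
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
    shift = np.argmax(np.abs(cc)) - max_shift
    return shift / fs                    # delay in seconds

# Synthetic check: a unit pulse delayed by 5 samples at fs = 16 kHz
fs = 16000
ref = np.zeros(512)
ref[100] = 1.0
delayed = np.zeros(512)
delayed[105] = 1.0
tau = gcc_phat(delayed, ref, fs)
```

The PHAT weighting whitens the cross-spectrum so that the correlation peak depends only on phase, which is what makes TDOA estimation comparatively robust under reverberation.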
Disclosure of Invention
For applications with strict real-time requirements, such as indoor mobile robots, where there is room to improve computational efficiency while preserving accuracy as far as possible in a small space, the invention provides a multi-sound-source positioning method based on double-layer Search Space Clustering (TL-SSC). The method uses 16 microphones; improves computational efficiency through coordinate grouping, real-time double-layer search, clustering-based screening, and threshold judgment; and achieves real-time multi-sound-source positioning using Time Difference of Arrival (TDOA) estimation.
In order to achieve the purpose, the invention adopts the following technical scheme:
a method for multiple sound source localization for use in reverberant environments, comprising the steps of:
s1, collecting coordinates in the whole search area, grouping the coordinates, and calculating the center coordinate of each group;
s2, collecting voice signals by using a microphone array;
s3, determining a candidate group of a certain sound source position by adopting a double-layer search space clustering multi-sound source positioning algorithm in a mode of calculating the central coordinate power of each group, positioning the sound source position in all coordinates contained in the candidate group, and removing the coordinates near the sound source in a search area;
the above operation of step S3 is repeated until all the sound source positions are located.
In order to optimize the technical scheme, the specific measures adopted further comprise:
further, the basis of the grouping in step S1 is:
Figure BDA0003124044750000021
Figure BDA0003124044750000022
Figure BDA0003124044750000023
if the ith coordinate qiBelong to the jth group zjThen p (q)i∈zj) Has a value of 1; if the ith coordinate qiNot belonging to group j zj,p(qi∈zj) Is 0;
wherein I represents the total number of coordinates in the entire search area, J represents the number of current groups, and zjExpressed as all sets of coordinates in the jth group; wherein the initial value of J is 1, and sequentially adding 1 until the formula
Figure BDA0003124044750000024
It holds that as i and j change, p (q)i∈zj)、e(qi,zj) And zjThe central coordinates of the three-dimensional image are calculated through a K-mean algorithm;
in the formula, e (q)i,zj) Representing the bunching error, defined as the difference in acoustic path between all microphone pairs
Figure BDA0003124044750000025
Summing;
Figure BDA0003124044750000026
representing the distance position q between microphone k and microphone liThe value of the TDOA value of (a),
Figure BDA0003124044750000027
representing the set z of distances between microphone k and microphone ljTDOA value of center coordinates, M representing the number of microphones, θtDenoted as threshold.
Further, the threshold θ_t is defined as

θ_t = λ = c / f

where λ is the wavelength, c is the speed of sound, and f is the sampling rate; in sound source positioning, the value of θ_t is determined by the maximum frequency of the speech signal.
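The grouping step above can be sketched as K-means over per-coordinate TDOA feature vectors, growing the number of groups J until every coordinate's clustering error falls below the threshold. This is an illustrative reconstruction under stated assumptions (free-field delays, speed of sound c = 343 m/s, simple random K-means initialization); the function names are my own, not the patent's.

```python
import numpy as np

def tdoa_vector(q, mics, c=343.0):
    """Pairwise TDOAs (in seconds) of point q for all microphone pairs."""
    d = np.linalg.norm(mics - q, axis=1)          # distance to each microphone
    k, l = np.triu_indices(len(mics), k=1)
    return (d[k] - d[l]) / c

def group_coordinates(grid, mics, theta_t, c=343.0, iters=20, seed=0):
    """Grow the number of groups J until every coordinate's clustering
    error e(q_i, z_j) (sum of |TDOA differences| over microphone pairs,
    relative to its group's center in TDOA space) is below theta_t."""
    rng = np.random.default_rng(seed)
    feats = np.array([tdoa_vector(q, mics, c) for q in grid])
    for J in range(1, len(grid) + 1):
        centers = feats[rng.choice(len(grid), size=J, replace=False)]
        for _ in range(iters):                    # plain K-means on TDOA features
            err = np.abs(feats[:, None, :] - centers[None]).sum(axis=2)
            labels = err.argmin(axis=1)
            for j in range(J):
                if np.any(labels == j):
                    centers[j] = feats[labels == j].mean(axis=0)
        err = np.abs(feats[:, None, :] - centers[None]).sum(axis=2)
        if err.min(axis=1).max() <= theta_t:
            break
    return labels, J
```

Clustering in TDOA space rather than Cartesian space groups together coordinates that the array can barely distinguish, which is what makes evaluating only the group centers in the first-layer search safe.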
Furthermore, the microphone array that collects the speech signals uses 16 microphones arranged on a cylinder, with 8 microphones uniformly distributed on each of the upper and lower rims.
Further, in the time domain, the coordinate power is computed as

y(t, q) = Σ_{m=1}^{M} g_m(t) * x_m(t + τ_{m,q})

where y(t, q) is the output value of coordinate position q at time t, g_m(t) is the impulse response of the filter at the m-th microphone ("*" denotes convolution), x_m(t + τ_{m,q}) is the signal received by the m-th microphone at time t + τ_{m,q}, and τ_{m,q} is the signal propagation time from coordinate position q to the m-th microphone. In the frequency domain, the coordinate power formula becomes

Y(ω, q) = Σ_{m=1}^{M} G_m(ω) X_m(ω) e^{jωτ_{m,q}}

where Y(ω, q) is the output value of coordinate position q at frequency ω, X_m(ω) is the Fourier transform of the m-th microphone signal, and G_m(ω) is the frequency-domain system function of the filter at the m-th microphone.

Based on the frequency-domain formula, the power output value P(q) of coordinate position q is

P(q) = Σ_{l=1}^{M} Σ_{k=1}^{M} ∫ G_l(ω) G_k*(ω) X_l(ω) X_k*(ω) e^{jω(τ_{l,q} − τ_{k,q})} dω

where G_l(ω) is the frequency-domain system function of the filter at the l-th microphone, X_l(ω) is the Fourier transform of the l-th microphone signal, G_k*(ω) is the conjugate of the frequency-domain system function of the filter at the k-th microphone, X_k*(ω) is the conjugate of the Fourier transform of the k-th microphone signal, and τ_{k,q} is the signal propagation time from coordinate position q to the k-th microphone. In this formula,

Ψ_{lk}(ω) = G_l(ω) G_k*(ω) = 1 / |X_l(ω) X_k*(ω)|

is the PHAT weighting coefficient between the l-th and k-th microphone signals.

After the power of the center coordinate of each group is computed, the candidate groups are determined, and a sound source is located among all coordinates contained in the candidate groups: the source position is the coordinate corresponding to the maximum power value, namely

q_s = argmax_q P(q).
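The frequency-domain power computation above amounts to steered-response power with PHAT weighting (SRP-PHAT). Below is a minimal sketch of the power evaluation for a single candidate coordinate, assuming free-field propagation delays; the function name and parameters are illustrative, not from the patent.

```python
import numpy as np

def srp_phat_power(signals, mics, q, fs, c=343.0):
    """P(q): sum over microphone pairs of the PHAT-weighted cross-spectrum,
    phase-aligned with the free-field delays of candidate position q."""
    M, n = signals.shape
    X = np.fft.rfft(signals, axis=1)
    omega = 2 * np.pi * np.fft.rfftfreq(n, d=1 / fs)
    tau = np.linalg.norm(mics - q, axis=1) / c   # propagation time to each mic
    power = 0.0
    for k in range(M):
        for l in range(k + 1, M):
            cross = X[l] * np.conj(X[k])
            cross /= np.abs(cross) + 1e-12       # PHAT weighting
            power += np.real(np.sum(cross * np.exp(1j * omega * (tau[l] - tau[k]))))
    return power
```

Summing only over pairs with k < l and taking the real part matches the full double sum up to a constant, since the (k, k) terms do not depend on q and the (l, k) and (k, l) terms are complex conjugates.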
further, the specific way of determining a certain sound source candidate group is as follows: searching according to the result of the power value calculation of the central coordinate of each group, selecting the group corresponding to the maximum power value as a first candidate group, and when judging the vth group in the rest groups, selecting the group as the candidate group under the condition that:
Figure BDA0003124044750000039
Figure BDA00031240447500000310
Figure BDA00031240447500000311
vc|≤θ1
Figure BDA00031240447500000312
stopping judging the candidate group when the number u of the candidate group reaches a certain number or all the groups are judged;
in the formula (X)b,Yb,Zb) Expressed as the centre coordinates of the b-th of the existing candidate groups in a Cartesian coordinate system, (X)cc,Ycc,Zcc) Expressed as the mean coordinate, theta, of the cartesian coordinate system after averaging the central coordinates of all the current candidate groupscTo average the azimuth of the mean coordinate after averaging the center coordinates of all the current candidate groups,
Figure BDA0003124044750000041
elevation angle theta of average coordinate after averaging center coordinates of all current candidate groupsvExpressed as the position of the center coordinate of the current group to be discriminatedThe azimuth angle of (a) is,
Figure BDA0003124044750000049
elevation angle, theta, expressed as the central coordinate of the current group to be discriminated1Indicated as an azimuth angle threshold value, is,
Figure BDA0003124044750000042
denoted as the elevation threshold.
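The candidate-group test can be sketched as follows; the function name and the threshold values used in any call are illustrative, not from the patent.

```python
import numpy as np

def angles(p):
    """Azimuth and elevation of a Cartesian point."""
    return np.arctan2(p[1], p[0]), np.arctan2(p[2], np.hypot(p[0], p[1]))

def is_candidate(center_v, candidate_centers, theta_1, phi_1):
    """Group v joins the candidate set when its center's azimuth and
    elevation both lie within the thresholds of the mean direction of
    the candidate-group centers already selected."""
    az_c, el_c = angles(np.mean(candidate_centers, axis=0))
    az_v, el_v = angles(np.asarray(center_v, dtype=float))
    return abs(az_v - az_c) <= theta_1 and abs(el_v - el_c) <= phi_1
```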
Further, removing the coordinates near a located sound source from the search area works as follows: a region Ω is defined; for the groups contained in Ω, the power of every coordinate position in the group is uniformly reduced to a given low power value E_l, and the groups contained in Ω are no longer considered in the subsequent steps that locate the remaining sound sources. A coordinate (θ, φ, r) in the spherical coordinate system belongs to the region Ω if

|θ − θ_s| ≤ θ_2
|φ − φ_s| ≤ φ_2

where θ is the azimuth of the coordinate, φ is its elevation, and r is its distance from the origin of the coordinate system; θ_s and φ_s are the azimuth and elevation of the most recently located sound source; and θ_2 and φ_2 are the azimuth and elevation thresholds.
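A sketch of the removal step, assuming the power values are stored per search coordinate; the function name and the uniform low value are illustrative assumptions.

```python
import numpy as np

def suppress_region(coords, powers, source, theta_2, phi_2, low_power=0.0):
    """Set the power of every coordinate whose azimuth/elevation lie
    within theta_2/phi_2 of the located source to a uniform low value,
    and report which coordinates were suppressed."""
    az = np.arctan2(coords[:, 1], coords[:, 0])
    el = np.arctan2(coords[:, 2], np.hypot(coords[:, 0], coords[:, 1]))
    az_s = np.arctan2(source[1], source[0])
    el_s = np.arctan2(source[2], np.hypot(source[0], source[1]))
    inside = (np.abs(az - az_s) <= theta_2) & (np.abs(el - el_s) <= phi_2)
    out = powers.copy()
    out[inside] = low_power
    return out, inside
```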
The invention has the following beneficial effects:
1. Through clustering-based screening, threshold judgment, and related steps, the method selects a suitable removal region and candidate-group screening conditions, so that the TL-SSC algorithm can be applied to multi-sound-source systems.
2. Compared with other multi-sound-source positioning methods, the method requires fewer microphones, improves computational efficiency while keeping high positioning accuracy, and meets the real-time requirements of mobile-robot applications.
Drawings
FIG. 1 is a schematic diagram of the microphone and sound-source distribution according to the present invention; in the figure, the five-pointed stars represent the microphones and the dots represent the sound sources.
Detailed Description
The present invention will now be described in further detail with reference to the accompanying drawings.
The positioning of two of multiple sound sources is taken as an example for explanation.
The primary coordinates in the whole search area are grouped and the center coordinate of each group is computed. The grouping minimizes the total clustering error

E = Σ_{i=1}^{I} Σ_{j=1}^{J} p(q_i ∈ z_j) · e(q_i, z_j)

subject to Σ_{j=1}^{J} p(q_i ∈ z_j) = 1 for every coordinate, with

e(q_i, z_j) = Σ_{k=1}^{M−1} Σ_{l=k+1}^{M} |τ_{kl}(q_i) − τ_{kl}(z_j)|

where I is the total number of primary coordinates in the whole search area, J is the current number of groups, M is the number of microphones, and θ_t is a threshold. z_j denotes the set of all coordinates in the j-th group; if the i-th coordinate q_i belongs to the j-th group z_j, then p(q_i ∈ z_j) is 1, otherwise it is 0. e(q_i, z_j) is the clustering error, defined as the sum over all microphone pairs of |τ_{kl}(q_i) − τ_{kl}(z_j)|, where τ_{kl}(q_i) is the TDOA value between microphone k and microphone l for position q_i, and τ_{kl}(z_j) is the corresponding TDOA value for the center coordinate of group z_j. The initial value of J is 1, and J is incremented by 1 each time until

e(q_i, z_j) ≤ θ_t

holds for every coordinate and its assigned group. As J increases and i and j change, p(q_i ∈ z_j), e(q_i, z_j), and the center coordinates of the groups z_j are recomputed by the K-means algorithm. The threshold θ_t is defined as

θ_t = λ = c / f

where λ is the wavelength, c is the speed of sound, and f is the sampling rate; in sound source positioning, the value of θ_t is determined by the maximum frequency of the speech signal.
A candidate-group set of one sound source (here, the first source to be located) is screened out and that source is located. From the grouping result, the power value of each group's center coordinate is computed and the first-layer search is performed, selecting the group whose center coordinate has the maximum power value as the first candidate group. Suppose u_1 groups have already been screened as candidates in the first-layer search; to continue selecting groups that belong to the same sound source as the existing candidates, the v_1-th remaining group is selected as a candidate if

|θ_v1 − θ_c1| ≤ θ_1
|φ_v1 − φ_c1| ≤ φ_1

until u_1 reaches a preset number n or all groups have been judged. Here (X_b1, Y_b1, Z_b1) denotes the Cartesian center coordinate of the b_1-th existing candidate group, (X_c1, Y_c1, Z_c1) is the mean of the center coordinates of all current candidate groups, θ_c1 and φ_c1 are the azimuth and elevation of that mean coordinate, θ_v1 and φ_v1 are the azimuth and elevation of the center coordinate of the group currently being judged, and θ_1 and φ_1 are the azimuth and elevation thresholds. Using the formula for P(q), the power of every primary coordinate in all candidate groups is then computed; this second-layer search finds the position with the maximum output power among those coordinates, which is taken as the position of the first sound source.
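The two search layers described above can be sketched as follows. For brevity this sketch keeps the n strongest groups instead of applying the azimuth/elevation screening rule; `power_fn` stands in for the P(q) evaluation and is supplied by the caller, so the names and structure here are illustrative rather than the patent's exact procedure.

```python
import numpy as np

def two_layer_search(group_centers, group_members, power_fn, n_candidates=3):
    """Layer 1: evaluate power only at the group center coordinates and
    keep the strongest groups as candidates. Layer 2: evaluate every
    coordinate inside the candidate groups and return the argmax."""
    center_power = np.array([power_fn(c) for c in group_centers])
    order = np.argsort(center_power)[::-1][:n_candidates]   # strongest first
    best_q, best_p = None, -np.inf
    for j in order:
        for q in group_members[j]:
            p = power_fn(q)
            if p > best_p:
                best_q, best_p = q, p
    return best_q, best_p
```

The efficiency gain comes from the first layer: the expensive power function is evaluated at J group centers plus the coordinates of a few candidate groups, instead of at every coordinate of the search area.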
The groups near the located source are then suppressed by assigning them a low power value, which reduces the influence of the coordinates surrounding that source on the positioning of the next source. A region Ω is defined; a coordinate (θ, φ, r) in the spherical coordinate system lies inside Ω if

|θ − θ_s| ≤ θ_2
|φ − φ_s| ≤ φ_2

where θ_2 and φ_2 are thresholds set on azimuth and elevation, and θ_s and φ_s are the azimuth and elevation of the coordinate of the located first sound source. The P(q) values of the groups contained in Ω are uniformly assigned the low power value E_l, and the groups contained in Ω are no longer considered in the following steps. Note that when locating the first sound source there is no previously located source, so no nearby-group power reduction is applied; the reduction and removal of coordinate power near a located source is performed from the second located sound source onward.
The second sound source is located using the modified power distribution. Because the region Ω cannot be guaranteed to contain every group near the first source that might affect the positioning of the second, the same screening method is used again: in the first-layer search, the group with the maximum power value among the groups not contained in Ω is selected as the first candidate group of the second source, and the remaining groups in the lookup table are then screened, which reduces the chance of including groups whose main power contribution comes from the first source. Suppose u_2 groups have been screened as candidates for the second source; the v_2-th remaining group is judged a candidate if

|θ_v2 − θ_c2| ≤ θ_1
|φ_v2 − φ_c2| ≤ φ_1

where (X_v2, Y_v2, Z_v2) denotes the center coordinate of the v_2-th group, (X_b2, Y_b2, Z_b2) denotes the Cartesian center coordinate of the b_2-th existing candidate group, (X_c2, Y_c2, Z_c2) is the mean of the center coordinates of all current candidate groups, θ_c2 and φ_c2 are the azimuth and elevation of that mean coordinate, θ_v2 and φ_v2 are the azimuth and elevation of the center coordinate of the group currently being judged, and θ_1 and φ_1 are the azimuth and elevation thresholds. The screening stops when the number of candidate groups u_2 reaches the preset value n or all groups have been judged. The P(q) values of all primary coordinates in the candidate groups are then computed and sorted, and the primary coordinate with the maximum power value is selected as the coordinate of the second sound source.
It should be noted that terms such as "upper", "lower", "left", "right", "front", and "back" are used herein only for clarity of description; they do not limit the scope of the invention, and changes to these relative relationships that do not substantively alter the technical content also fall within the scope of the invention.
The above is only a preferred embodiment of the present invention; the scope of protection is not limited to this embodiment, and all technical solutions within the idea of the invention fall within its scope. It should be noted that modifications and refinements that a person skilled in the art could make without departing from the principle of the invention are also within its scope.

Claims (7)

1. A method for locating multiple sound sources suitable for use in a reverberant environment, comprising the steps of:
s1, collecting coordinates in the whole search area, grouping the coordinates, and calculating the center coordinate of each group;
s2, collecting voice signals by using a microphone array;
s3, determining a candidate group of a certain sound source position by adopting a double-layer search space clustering multi-sound source positioning algorithm in a mode of calculating the central coordinate power of each group, positioning the sound source position in all coordinates contained in the candidate group, and removing the coordinates near the sound source in a search area;
and repeating the operations until all the sound source positions are positioned.
2. The method of claim 1, wherein the grouping in step S1 minimizes the total clustering error

E = Σ_{i=1}^{I} Σ_{j=1}^{J} p(q_i ∈ z_j) · e(q_i, z_j)

subject to Σ_{j=1}^{J} p(q_i ∈ z_j) = 1 for every coordinate, with

e(q_i, z_j) = Σ_{k=1}^{M−1} Σ_{l=k+1}^{M} |τ_{kl}(q_i) − τ_{kl}(z_j)|

if the i-th coordinate q_i belongs to the j-th group z_j, then p(q_i ∈ z_j) has the value 1; if q_i does not belong to z_j, p(q_i ∈ z_j) is 0;

wherein I is the total number of coordinates in the whole search area, J is the current number of groups, and z_j denotes the set of all coordinates in the j-th group; the initial value of J is 1, and J is incremented by 1 until

e(q_i, z_j) ≤ θ_t

holds for every coordinate and its assigned group; as i and j change, p(q_i ∈ z_j), e(q_i, z_j), and the center coordinates of the groups z_j are computed with the K-means algorithm;

e(q_i, z_j) is the clustering error, defined as the sum over all microphone pairs of |τ_{kl}(q_i) − τ_{kl}(z_j)|, where τ_{kl}(q_i) is the TDOA value between microphone k and microphone l for position q_i, τ_{kl}(z_j) is the TDOA value between microphone k and microphone l for the center coordinate of group z_j, M is the number of microphones, and θ_t is a threshold.
3. The method of claim 2, wherein the threshold θ_t is defined as

θ_t = λ = c / f

where λ is the wavelength, c is the speed of sound, and f is the sampling rate; in sound source positioning, the value of θ_t is determined by the maximum frequency of the speech signal.
4. The method of claim 3, wherein the microphone array that collects the speech signals uses 16 microphones arranged on a cylinder, with 8 microphones uniformly distributed on each of the upper and lower rims.
5. The method of claim 2, wherein in the frequency domain the coordinate power is computed as

Y(ω, q) = Σ_{m=1}^{M} G_m(ω) X_m(ω) e^{jωτ_{m,q}}

where Y(ω, q) is the output value of coordinate position q at frequency ω, X_m(ω) is the Fourier transform of the m-th microphone signal, G_m(ω) is the frequency-domain system function of the filter at the m-th microphone, and τ_{m,q} is the signal propagation time from coordinate position q to the m-th microphone;

based on the frequency-domain formula, the power output value P(q) of coordinate position q is

P(q) = Σ_{l=1}^{M} Σ_{k=1}^{M} ∫ G_l(ω) G_k*(ω) X_l(ω) X_k*(ω) e^{jω(τ_{l,q} − τ_{k,q})} dω

where G_l(ω) is the frequency-domain system function of the filter at the l-th microphone, X_l(ω) is the Fourier transform of the l-th microphone signal, G_k*(ω) is the conjugate of the frequency-domain system function of the filter at the k-th microphone, X_k*(ω) is the conjugate of the Fourier transform of the k-th microphone signal, and τ_{k,q} is the signal propagation time from coordinate position q to the k-th microphone; in this formula,

Ψ_{lk}(ω) = G_l(ω) G_k*(ω) = 1 / |X_l(ω) X_k*(ω)|

is the PHAT weighting coefficient between the l-th and k-th microphone signals;

after the power of the center coordinate of each group is computed, the candidate groups are determined, and a sound source is located among all coordinates contained in the candidate groups as the coordinate corresponding to the maximum power value, namely

q_s = argmax_q P(q).
6. The method of claim 5, wherein the candidate groups of a sound source are determined as follows: the groups are searched according to the computed power values of their center coordinates, and the group with the maximum power value is selected as the first candidate group; when judging the v-th group among the remaining groups, the group is selected as a candidate if

|θ_v − θ_c| ≤ θ_1
|φ_v − φ_c| ≤ φ_1

and the judging of candidate groups stops when their number u reaches a preset value or all groups have been judged;

wherein (X_b, Y_b, Z_b) denotes the Cartesian center coordinate of the b-th existing candidate group, (X_cc, Y_cc, Z_cc) = (1/u) Σ_{b=1}^{u} (X_b, Y_b, Z_b) is the mean of the center coordinates of all current candidate groups, θ_c and φ_c are the azimuth and elevation of that mean coordinate, θ_v and φ_v are the azimuth and elevation of the center coordinate of the group currently being judged, and θ_1 and φ_1 are the azimuth and elevation thresholds.
7. The method as claimed in claim 1, wherein the specific content of removing the coordinates near the sound source in the search area is:
a region omega is provided, and for the sub-groups contained in the region omega, the power of the coordinate position in the sub-group is uniformly reduced and a power value E is givenlMeanwhile, the small groups contained in the region omega are not considered in the subsequent step of positioning the positions of other sound sources;
in the spherical coordinate system, the coordinates (θ, φ, r) within the region Ω are required to satisfy:
|θ-θs|≤θ2
|φ-φs|≤φ2
where θ is expressed as the azimuth angle of the coordinate, φ is expressed as the elevation angle of the coordinate, r represents the distance of the coordinate from the origin of the coordinate system, θs is expressed as the azimuth angle of the coordinate position of the most recently located sound source, φs is expressed as the elevation angle of the coordinate position of the most recently located sound source, θ2 is expressed as an azimuth angle threshold, and φ2 is expressed as an elevation angle threshold.
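The region suppression described by this claim can be sketched as follows. The function name, the flat NumPy arrays standing in for the steered-response power map, and passing El as an argument are all assumptions for illustration; the condition itself (|θ-θs|≤θ2 and |φ-φs|≤φ2) is taken directly from the claim.

```python
import numpy as np

def suppress_region(points, powers, theta_s, phi_s, theta_2, phi_2, E_l):
    """Overwrite the power of every spherical-grid point (theta, phi, r) that
    lies inside the region Omega around the most recently located source,
    i.e. |theta - theta_s| <= theta_2 and |phi - phi_s| <= phi_2, and return
    a mask so those points can be skipped when locating the next source."""
    pts = np.asarray(points, dtype=float)          # columns: theta, phi, r
    in_omega = (np.abs(pts[:, 0] - theta_s) <= theta_2) & \
               (np.abs(pts[:, 1] - phi_s) <= phi_2)
    new_powers = np.asarray(powers, dtype=float).copy()
    new_powers[in_omega] = E_l                     # uniform reduced power value
    return new_powers, in_omega

# Two grid points; only the first lies inside Omega around (theta_s, phi_s) = (0.1, 0.2).
grid = [(0.10, 0.20, 1.0), (1.20, 0.20, 1.0)]
p, mask = suppress_region(grid, [5.0, 4.0], theta_s=0.1, phi_s=0.2,
                          theta_2=0.2, phi_2=0.2, E_l=0.0)
print(p.tolist(), mask.tolist())  # prints [0.0, 4.0] [True, False]
```

Returning the mask alongside the reduced powers mirrors the claim's second requirement: the suppressed groups are excluded from the subsequent search for the remaining sources, not merely attenuated.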
CN202110684270.9A 2021-06-21 2021-06-21 Multi-sound source positioning method suitable for reverberant environment Active CN113419216B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110684270.9A CN113419216B (en) 2021-06-21 2021-06-21 Multi-sound source positioning method suitable for reverberant environment

Publications (2)

Publication Number Publication Date
CN113419216A true CN113419216A (en) 2021-09-21
CN113419216B CN113419216B (en) 2023-10-31

Family

ID=77789393

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110684270.9A Active CN113419216B (en) 2021-06-21 2021-06-21 Multi-sound source positioning method suitable for reverberant environment

Country Status (1)

Country Link
CN (1) CN113419216B (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106093864A (en) * 2016-06-03 2016-11-09 清华大学 A kind of microphone array sound source space real-time location method
US9554208B1 (en) * 2014-03-28 2017-01-24 Marvell International Ltd. Concurrent sound source localization of multiple speakers
CN106940439A (en) * 2017-03-01 2017-07-11 西安电子科技大学 K mean cluster weighting sound localization method based on wireless acoustic sensor network
CN108198568A (en) * 2017-12-26 2018-06-22 太原理工大学 A kind of method and system of more auditory localizations
CN110443371A (en) * 2019-06-25 2019-11-12 深圳欧克曼技术有限公司 A kind of artificial intelligence device and method
CN111352075A (en) * 2018-12-20 2020-06-30 中国科学院声学研究所 Underwater multi-sound-source positioning method and system based on deep learning
CN111474521A (en) * 2020-04-09 2020-07-31 南京理工大学 Sound source positioning method based on microphone array in multipath environment
CN111489753A (en) * 2020-06-24 2020-08-04 深圳市友杰智新科技有限公司 Anti-noise sound source positioning method and device and computer equipment

Non-Patent Citations (7)

* Cited by examiner, † Cited by third party
Title
"MULTIPLE SOUND SOURCE LOCALIZATION BASED ON TDOA CLUSTERING AND MULTI-PATH MATCHING PURSUIT", 《IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING (ICASSP)》 *
"Multi-source sound localization using the competitive k-means clustering", 《2010 IEEE 15TH CONFERENCE ON EMERGING TECHNOLOGIES & FACTORY AUTOMATION (ETFA 2010)》 *
YOOK D: "Fast sound source localization using two-level search space clustering", 《IEEE TRANSACTIONS ON CYBERNETICS》, vol. 46, no. 1, pages 1 - 5 *
倪志莲;蔡卫平;张怡典;: "基于子带可控响应功率的多声源定位方法", 计算机工程与应用, no. 24 *
庄启雷;黄青华;: "基于三线交点球麦克风阵列的远场多声源定位", 上海大学学报(自然科学版), no. 02 *
滕鹏晓;杨亦春;李晓东;田静;: "基于多阵列数据融合的宽带多声源定位研究", 应用声学, no. 03 *
赵小燕;汤捷;周琳;吴镇扬;: "基于相位差复指数变换的传声器多声源定位", 东南大学学报(自然科学版), no. 02 *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115662383A (en) * 2022-12-22 2023-01-31 杭州爱华智能科技有限公司 Method and system for deleting main sound source, method, system and device for identifying multiple sound sources
CN117828405A (en) * 2024-02-23 2024-04-05 兰州交通大学 Signal positioning method based on intelligent frequency spectrum sensing
CN117828405B (en) * 2024-02-23 2024-05-07 兰州交通大学 Signal positioning method based on intelligent frequency spectrum sensing

Similar Documents

Publication Publication Date Title
CN107102296B (en) Sound source positioning system based on distributed microphone array
US9837099B1 (en) Method and system for beam selection in microphone array beamformers
CN113419216A (en) Multi-sound-source positioning method suitable for reverberation environment
CN110068795A (en) A kind of indoor microphone array sound localization method based on convolutional neural networks
CN104142492A (en) SRP-PHAT multi-source spatial positioning method
CN112904279B (en) Sound source positioning method based on convolutional neural network and subband SRP-PHAT spatial spectrum
CN111429939B (en) Sound signal separation method of double sound sources and pickup
CN107167770A (en) A kind of microphone array sound source locating device under the conditions of reverberation
Di Carlo et al. Mirage: 2d source localization using microphone pair augmentation with echoes
CN109212481A (en) A method of auditory localization is carried out using microphone array
Alexandridis et al. Multiple sound source location estimation and counting in a wireless acoustic sensor network
CN109884591A (en) A kind of multi-rotor unmanned aerial vehicle acoustical signal Enhancement Method based on microphone array
CN105607042A (en) Method for locating sound source through microphone array time delay estimation
CN110610718A (en) Method and device for extracting expected sound source voice signal
US20130148814A1 (en) Audio acquisition systems and methods
EP2362238B1 (en) Estimating the distance from a sensor to a sound source
Rascon et al. Lightweight multi-DOA tracking of mobile speech sources
CN112363112B (en) Sound source positioning method and device based on linear microphone array
KR20090128221A (en) Method for sound source localization and system thereof
CN113514801A (en) Microphone array sound source positioning method and sound source identification method based on deep learning
Himawan et al. Clustering of ad-hoc microphone arrays for robust blind beamforming
Brutti et al. Speaker localization based on oriented global coherence field
CN110927668A (en) Sound source positioning optimization method of cube microphone array based on particle swarm
CN116008913A (en) Unmanned aerial vehicle detection positioning method based on STM32 and small microphone array
CN110441779B (en) Multi-sonobuoy distributed co-location method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant