CN102854494B - Sound source localization method and device - Google Patents

Publication number
CN102854494B
Authority
CN
China
Prior art keywords
sound
signal
source signal
function
way
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN201210281019.9A
Other languages
Chinese (zh)
Other versions
CN102854494A (en)
Inventor
彭迎标
邵诗强
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
TCL Corp
Original Assignee
TCL Corp
Application filed by TCL Corp
Priority to CN201210281019.9A
Publication of CN102854494A
Application granted
Publication of CN102854494B

Landscapes

  • Circuit For Audible Band Transducer (AREA)

Abstract

The present invention is applicable to the technical field of sound processing and provides a sound source localization method and device. The method comprises: collecting sound source signals with a microphone array, and preprocessing the sound source signals collected by any two of the microphones; determining the cross-spectral density function of the two preprocessed sound source signals; determining a weighting function that is adjusted as the current signal-to-noise ratio changes; determining the value sequence of the cross-correlation function of the two sound source signals according to the cross-spectral density function and the weighting function, and determining the time delay with which the sound source signal arrives at the two microphones according to the maximum of the cross-correlation function; and locating the sound source position according to the arrangement of the microphone array and the time delay with which the sound source signal arrives at any two of the microphones. Because the weighting function adopted by the present invention is adjusted as the current signal-to-noise ratio changes, the time delay of the sound source signal can still be obtained accurately in environments where the signal-to-noise ratio of the sound source varies, which improves sound source localization accuracy.

Description

Sound source localization method and device
Technical field
The invention belongs to the technical field of sound processing, and in particular relates to a sound source localization method and device.
Background art
Sound sources often need to be located in video conferencing, security, and some industrial applications. In some scenarios, however, the external acoustic environment is uncertain and the sound signal is disturbed by external noise, so the signal-to-noise ratio varies. In existing sound source localization technology, a group of audio data is obtained by a microphone array and, after preprocessing, the time delay is estimated with the phase transform generalized cross-correlation method (PHAT-GCC); the position of the sound source can then be determined with a geometric model from the time delay result and the arrangement of the microphones in the array. In the existing PHAT-GCC method, because the signal-to-noise ratio of the sound source signal may change with the environment, the denominator of the weighting function used for frequency-domain weighting tends to zero when the signal energy is small, so the value of the weighting function becomes very large. The time delay obtained in this way has a large error, and the sound source position finally located also has a very large error.
In the prior art, the conventional frequency-domain weighting function cannot withstand strong noise and reverberation in practical applications, and when the energy of the speech signal is small, the denominator of the weighting function approaches zero, which produces a large error. In the embodiment of the present invention, the weighting function shown in formula (10) or formula (11) is tied to the current signal-to-noise ratio, where ρ is an adjustment factor proportional to the current signal-to-noise ratio SNR(λ). The value of ρ is obtained from repeated experimental tests in the sound source environment and depends on the current signal-to-noise ratio SNR(λ): ρ takes different values for different SNR(λ), and the higher SNR(λ) is, the larger ρ is. As one specific choice of values, when SNR(λ) ≤ 10 dB, the range of ρ is 0.3 ≤ ρ ≤ 0.55; when 10 dB < SNR(λ) ≤ 25 dB, the range of ρ is 0.55 < ρ ≤ 0.75; and when 25 dB < SNR(λ), the range of ρ is 0.75 < ρ ≤ 0.85.
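As a non-limiting illustration, the value ranges above can be turned into a simple lookup, sketched here in Python. The source only gives a range of ρ for each SNR band; the linear interpolation inside each band, the function name `choose_rho`, and the 0 dB and 40 dB end points are assumptions made for this sketch and are not taken from the source.

```python
def choose_rho(snr_db, low_db=0.0, high_db=40.0):
    """Map the current SNR (in dB) to an adjustment factor rho.

    The three bands and their rho ranges follow the text; interpolating
    linearly inside each band is an assumption of this sketch.
    """
    def lerp(x, x0, x1, y0, y1):
        # Clamp the position inside the band to [0, 1] before interpolating.
        t = min(max((x - x0) / (x1 - x0), 0.0), 1.0)
        return y0 + t * (y1 - y0)

    if snr_db <= 10.0:
        return lerp(snr_db, low_db, 10.0, 0.30, 0.55)
    if snr_db <= 25.0:
        return lerp(snr_db, 10.0, 25.0, 0.55, 0.75)
    return lerp(snr_db, 25.0, high_db, 0.75, 0.85)


rho_noisy = choose_rho(5.0)    # falls in the 0.3 to 0.55 band
rho_clean = choose_rho(30.0)   # falls in the 0.75 to 0.85 band
```

Any other monotonically increasing mapping that stays inside the stated ranges would serve equally well for this illustration.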
For formula (10), if the current signal-to-noise ratio is small, that is, the energy of the speech signal is small, $\varphi_{12}(w)$ is small. If ρ is taken as 0.5, the value of the weighting function is much smaller than that of the existing weighting function, which reduces the error to a certain extent. Formula (11) further takes additive noise into account: the denominator of the weighting function also includes a coherence function, whose magnitude is independent of the magnitude of the speech signal, which further ensures that the value of the weighting function does not fluctuate sharply and reduces the error.
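The behaviour described above can be illustrated numerically with the Python sketch below. The exact expressions of formula (10) and formula (11) are not reproduced in this text, so the forms used here are assumptions chosen only to match the description: the standard PHAT weighting $1/|\varphi_{12}(w)|$ for the existing method, $1/|\varphi_{12}(w)|^{\rho}$ for a ρ-adjusted weighting, and a variant whose denominator additionally contains the magnitude-squared coherence $|\varphi_{12}(w)|^{2}/(\varphi_{1}(w)\varphi_{2}(w))$. All function names are hypothetical.

```python
def phat_weight(phi12):
    """Standard PHAT weighting: grows without bound as |phi12| -> 0."""
    return 1.0 / abs(phi12)


def rho_weight(phi12, rho):
    """Assumed rho-adjusted weighting, 1 / |phi12|**rho."""
    return 1.0 / (abs(phi12) ** rho)


def coherence_sq(phi12, phi1, phi2):
    """Magnitude-squared coherence; phi1 and phi2 are real auto-spectra."""
    return abs(phi12) ** 2 / (phi1 * phi2)


def rho_gamma_weight(phi12, phi1, phi2, rho):
    """Assumed weighting with a coherence term added to the denominator."""
    return 1.0 / (abs(phi12) ** rho + coherence_sq(phi12, phi1, phi2))


# A low-energy frequency bin: PHAT blows up, the rho-adjusted weight does not.
small = 0.01 + 0.0j
w_phat = phat_weight(small)
w_rho = rho_weight(small, rho=0.5)
w_rho_gamma = rho_gamma_weight(small, phi1=0.01, phi2=0.01, rho=0.5)
```

Scaling both input signals by the same gain scales $\varphi_{12}$, $\varphi_{1}$ and $\varphi_{2}$ by the same factor, so the coherence term in this sketch stays unchanged, which is the property the text relies on.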
Step S209: apply the inverse Fourier transform to the product of the cross-spectral density function and the weighting function to obtain the value sequence of the cross-correlation function of the two sound source signals;
Step S210: perform peak detection on the value sequence of the cross-correlation function to obtain the sample point corresponding to the maximum, and determine the time delay with which the sound source signal arrives at the two microphones according to the sampling interval.
Steps S209-S210 above are a specific preferred implementation of step S104 in embodiment one.
In step S209, the inverse Fourier transform is applied to the product of the cross-spectral density function $R_{12}(\lambda, k)$ in formula (7) and the weighting function in formula (10) or formula (11), which gives the cross-correlation function $r_{12}(\lambda, n)$ of the two speech signals.
In step S210, peak detection is performed on the cross-correlation function $r_{12}(\lambda, n)$ to find the sample point corresponding to the largest discrete value; multiplying that sample point by the sampling interval gives the time delay between the two sound source signals.
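Steps S209 and S210 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: a direct O(N²) DFT stands in for the fast Fourier transform, the weighting is assumed to be $1/|R_{12}|^{\rho}$ (the exact formulas (10) and (11) are not reproduced in this text), a small `eps` guards against division by zero, and peak indices in the upper half of the sequence are interpreted as negative lags. The names `dft` and `estimate_delay` are hypothetical.

```python
import cmath
import random


def dft(x, inverse=False):
    """Direct DFT (or inverse DFT) of a sequence; fine for short frames."""
    N = len(x)
    sign = 1 if inverse else -1
    out = [sum(x[n] * cmath.exp(sign * 2j * cmath.pi * n * k / N)
               for n in range(N))
           for k in range(N)]
    return [v / N for v in out] if inverse else out


def estimate_delay(s1, s2, rho, sample_rate, eps=1e-12):
    """Return the delay D in seconds for which s2(n) is close to s1(n + D)."""
    S1, S2 = dft(s1), dft(s2)
    # Cross-spectral density, as in formula (7): R12 = S1 * conj(S2).
    R12 = [a * b.conjugate() for a, b in zip(S1, S2)]
    # Assumed weighting 1 / |R12|**rho applied in the frequency domain.
    weighted = [r / (abs(r) ** rho + eps) for r in R12]
    # Inverse transform gives the cross-correlation value sequence r12(n).
    r12 = [v.real for v in dft(weighted, inverse=True)]
    peak = max(range(len(r12)), key=lambda n: r12[n])
    # Indices past the midpoint wrap around to negative lags.
    lag = peak if peak < len(r12) // 2 else peak - len(r12)
    return lag / sample_rate


rng = random.Random(1)
N, D = 64, 5
s1 = [rng.uniform(-1.0, 1.0) for _ in range(N)]
s2 = [s1[(n + D) % N] for n in range(N)]   # circularly advanced copy of s1
delay = estimate_delay(s1, s2, rho=0.75, sample_rate=16000.0)
```

With the sign convention of formula (2), where the second channel is $s_1(t + D)$, the peak of the sequence falls at lag $D$, so multiplying the lag by the sampling interval gives the delay directly.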
Step S211: locate the sound source position according to the arrangement of the microphone array and the time delay with which the sound source signal arrives at the two microphones; this flow then ends and processing moves on to the next frame.
After the time delay value has been obtained, the specific position of the sound source can be determined from the geometric model of the microphone positions in the microphone array.
Step S211 is a specific preferred implementation of step S105 in embodiment one.
On the basis of embodiment one, this embodiment gives specific preferred implementations of its steps and can locate the sound source accurately.
Summary of the invention
In view of the above problems, the object of the present invention is to provide a sound source localization method, intended to solve the technical problem in existing sound source localization technology that, when the signal-to-noise ratio of the sound source signal changes, the value of the weighting function may become very large, so that the localization result has a very large error.
The present invention is realized as follows: a sound source localization method comprises the following steps:
collecting sound source signals with a microphone array, and preprocessing the sound source signals collected by any two of the microphones;
determining the cross-spectral density function of the two preprocessed sound source signals;
determining a weighting function that is adjusted as the current signal-to-noise ratio changes;
determining the value sequence of the cross-correlation function of the two sound source signals according to the cross-spectral density function and the weighting function, and determining the time delay with which the sound source signal arrives at the two microphones according to the maximum of the cross-correlation function;
locating the sound source position according to the arrangement of the microphone array and the time delay with which the sound source signal arrives at the two microphones.
Another object of the present invention is to provide a sound source localization device, comprising:
a microphone array acquisition and preprocessing unit, configured to collect sound source signals with a microphone array and to preprocess the sound source signals collected by any two of the microphones;
a cross-spectral density determining unit, configured to determine the cross-spectral density function of the two preprocessed sound source signals;
a weighting function determining unit, configured to determine a weighting function that is adjusted as the current signal-to-noise ratio changes;
a time delay determining unit, configured to determine the value sequence of the cross-correlation function of the two sound source signals according to the cross-spectral density function and the weighting function, and to determine the time delay with which the sound source signal arrives at the two microphones according to the maximum of the cross-correlation function;
a sound source localization unit, configured to locate the sound source position according to the arrangement of the microphone array and the time delay with which the sound source signal arrives at the two microphones.
The beneficial effects of the present invention are as follows: because the weighting function adopted by the sound source localization method and device provided by the present invention is adjusted as the current signal-to-noise ratio changes, in environments where the signal-to-noise ratio of the sound source varies under the influence of factors such as background noise and reverberation, the time delay of the sound signal can still be obtained accurately by adjusting the weighting function accordingly, which improves sound source localization accuracy.
Brief description of the drawings
Fig. 1 is a flowchart of the sound source localization method provided by the first embodiment of the present invention;
Fig. 2 is a flowchart of the sound source localization method provided by the second embodiment of the present invention;
Fig. 3 is a block diagram of the sound source localization device provided by the third embodiment of the present invention;
Fig. 4 is a block diagram of the sound source localization device provided by the fourth embodiment of the present invention.
Detailed description of the embodiments
In order to make the objects, technical solutions, and advantages of the present invention clearer, the present invention is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here serve only to explain the present invention and are not intended to limit it.
The technical solutions of the present invention are illustrated below by means of specific embodiments.
Embodiment one:
Fig. 1 shows the flow of the sound source localization method provided by the first embodiment of the present invention; for ease of explanation, only the parts relevant to this embodiment are shown.
The sound source localization method provided by this embodiment comprises:
Step S101: a microphone array collects sound source signals, and the sound source signals collected by any two of the microphones are preprocessed.
A microphone array is a set of multiple microphones arranged in a certain way and is commonly used to collect sound source signals in sound source localization technology; it yields a group of sound source signals. In this step, the sound source signals collected by any two of the microphones are taken and preprocessed, including filtering and framing.
Step S102: determine the cross-spectral density function of the two preprocessed sound source signals;
Step S103: determine a weighting function that is adjusted as the current signal-to-noise ratio changes;
Step S104: determine the value sequence of the cross-correlation function of the two sound source signals according to the cross-spectral density function and the weighting function, and determine the time delay (time difference) with which the sound source signal arrives at the two microphones according to the maximum of the cross-correlation function.
Steps S102-S104 describe how the time delay with which the sound source signal arrives at the two microphones is determined; the accuracy of this time delay determines the accuracy of the sound source localization. The usual way of determining the time delay is to first determine the cross-spectral density function of the two sound source signals and a weighting function, then apply the inverse Fourier transform to the product of the cross-spectral density function and the weighting function to obtain the value sequence of the cross-correlation function of the two signals, and finally determine the time delay from the maximum of the cross-correlation function. However, the existing weighting function cannot follow changes in the current signal-to-noise ratio: it cannot withstand strong background noise and reverberation, and when the energy of the speech signal is small its value becomes very large, so the subsequent time delay determination has a very large error. In this embodiment, the weighting function determined in step S103 is adjusted as the current signal-to-noise ratio changes, so its value does not become very large when the energy of the speech signal is small, which ensures the accuracy of the determined time delay.
Step S105: locate the sound source position according to the arrangement of the microphone array and the time delay with which the sound source signal arrives at the two microphones.
The principle of sound source localization is to determine the time delay with which the sound source signal arrives at two microphones and then, from the specific positions of those microphones, determine the specific position of the sound source with a geometric model. Because this embodiment determines the time delay with higher accuracy, the sound source position can be located accurately by geometric analysis. The specific localization method is the same as in existing sound source localization technology and is not described again here.
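The source does not spell out the geometric model. As one common textbook simplification, and purely as an assumed illustration, a single microphone pair under a far-field (plane-wave) assumption yields a direction of arrival rather than a full position, via $\theta = \arcsin(c\,\tau / d)$. The Python sketch below uses this relation; the far-field assumption, the speed of sound of 343 m/s, and the name `arrival_angle` are all assumptions of the sketch, and a near-field source or a full position estimate would need delays from additional microphone pairs.

```python
import math


def arrival_angle(delay_s, mic_spacing_m, speed_of_sound=343.0):
    """Far-field direction of arrival in degrees, measured from broadside.

    delay_s is the time delay between the two microphones and
    mic_spacing_m is the distance between them.
    """
    ratio = speed_of_sound * delay_s / mic_spacing_m
    # Measurement noise can push the ratio slightly outside [-1, 1].
    ratio = max(-1.0, min(1.0, ratio))
    return math.degrees(math.asin(ratio))


# Two microphones 0.2 m apart; a delay of half the maximum possible value.
angle = arrival_angle(delay_s=0.1 / 343.0, mic_spacing_m=0.2)
```

A zero delay corresponds to a source straight ahead of the pair, and the largest physically possible delay, the spacing divided by the speed of sound, corresponds to a source on the line through both microphones.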
The key difference between this embodiment and existing sound source localization technology is that the weighting function provided by this embodiment is adjusted as the current signal-to-noise ratio changes, so that its value does not change sharply when the current signal-to-noise ratio changes. The accuracy of the time delay value finally determined is thereby ensured, which in turn improves sound source localization accuracy.
Embodiment two:
Fig. 2 shows the flow of the sound source localization method provided by the second embodiment of the present invention; for ease of explanation, only the parts relevant to this embodiment are shown.
The sound source localization method provided by this embodiment comprises:
Step S201: a microphone array collects sound source signals;
Step S202: apply band-pass filtering to the sound source signals collected by any two microphones in the microphone array to obtain two band-pass-filtered sound source signals;
Step S203: apply windowing and framing to the two band-pass-filtered sound source signals to obtain two short-time stationary signals.
Steps S201-S203 above are a specific preferred implementation of step S101 in embodiment one.
In step S201, the sound source signals collected by the two microphones are assumed to be, respectively:
$x_1(t) = a_1 s_1(t) + n_1(t)$ (1)
$x_2(t) = a_2 s_1(t + D) + n_2(t)$ (2)
Here $a_1$ and $a_2$ are the sound attenuation factors; because the sound source is a near-field signal, $a_1$ and $a_2$ can be taken to be 1. $D$ is the time delay with which the sound source signal arrives at the two microphones, and $n_1(t)$ and $n_2(t)$ are the noise signals arriving at the two microphones.
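The signal model of formulas (1) and (2) can be simulated with the short Python sketch below, which may be useful for testing a delay estimator. It is an illustration only: the delay is restricted to a whole number of samples, samples shifted past the end of the source are taken as zero, the noise is Gaussian, and the name `simulate_pair` and its parameters are hypothetical.

```python
import math
import random


def simulate_pair(source, delay_samples, noise_std=0.0, seed=0):
    """Return (x1, x2) with x1(t) = s(t) + n1(t) and x2(t) = s(t + D) + n2(t).

    The attenuation factors a1 and a2 are taken to be 1, as in the text.
    """
    rng = random.Random(seed)
    n = len(source)
    x1, x2 = [], []
    for t in range(n):
        shifted = t + delay_samples
        # Samples outside the recorded source are treated as silence.
        s2 = source[shifted] if 0 <= shifted < n else 0.0
        x1.append(source[t] + rng.gauss(0.0, noise_std))
        x2.append(s2 + rng.gauss(0.0, noise_std))
    return x1, x2


source = [math.sin(2 * math.pi * 5 * t / 64) for t in range(64)]
x1, x2 = simulate_pair(source, delay_samples=3)
```

Raising `noise_std` lowers the signal-to-noise ratio of both channels, which is the condition under which the choice of weighting function matters most.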
In step S202, band-pass filtering is applied to the sound source signals collected by the microphones to filter out noise in the low and high frequency bands, providing two band-pass-filtered sound source signals for subsequent processing.
In step S203, as one implementation, a Hamming window function is used to frame the two band-pass-filtered sound source signals, giving two short-time stationary signals; windowed framing generally uses overlapping frames. The two short-time stationary signals are:
$s_1(\lambda, n) = x_1(n + d(\lambda - 1)N)\, w(n)$ (3)
$s_2(\lambda, n) = x_2(n + d(\lambda - 1)N)\, w(n)$ (4)
Here $w(n)$ is the Hamming window function, $N$ is the length of the window function $w(n)$, $d$ is the shift parameter between adjacent frames, and λ is the frame index.
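Formulas (3) and (4) can be sketched in Python as below. The window uses the usual Hamming coefficients 0.54 and 0.46, which the source does not state explicitly; zero-padding past the end of the signal, rounding the frame start to a whole sample, and the names `hamming_window` and `frame_signal` are assumptions of this sketch.

```python
import math


def hamming_window(N):
    """Hamming window w(n) of length N with the usual 0.54 / 0.46 coefficients."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * n / (N - 1)) for n in range(N)]


def frame_signal(x, lam, N, d):
    """Frame number lam (starting at 1): s(lam, n) = x(n + d*(lam-1)*N) * w(n).

    d is the shift between adjacent frames as a fraction of the frame
    length, so d = 0.5 gives 50 % overlap.
    """
    start = int(round(d * (lam - 1) * N))
    w = hamming_window(N)
    return [(x[start + n] if start + n < len(x) else 0.0) * w[n]
            for n in range(N)]


signal = [float(i) for i in range(32)]
first = frame_signal(signal, lam=1, N=8, d=0.5)    # covers samples 0..7
second = frame_signal(signal, lam=2, N=8, d=0.5)   # covers samples 4..11
```

With `d = 0.5` each frame shares half of its samples with the previous one, which is the overlapping scheme the text mentions.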
Step S204: determine by endpoint detection whether the two short-time stationary signals are speech signals; if yes, perform step S205; if no, perform step S207.
Step S205: determine the current signal-to-noise ratio, which is SNR(λ) = a·SNR(λ-1) + (1-a)·SNR_0, where SNR(λ-1) is the signal-to-noise ratio of the previous frame of the sound source signal, SNR_0 is the a priori signal-to-noise ratio obtained from the energy ratio of the current speech frame to the last non-speech frame, and a is a smoothing factor;
Step S206: apply the fast Fourier transform to the two short-time stationary signals, and then determine the cross-spectral density function of the two short-time stationary signals;
Step S207: discard the two short-time stationary signals and update the signal-to-noise ratio as SNR(λ) = SNR(λ-1), where SNR(λ-1) is the signal-to-noise ratio of the previous frame of the sound source signal; this flow then ends and processing moves on to the next frame;
Steps S204-S207 above are a specific preferred implementation of step S102 in embodiment one. In step S204, endpoint detection is used to determine whether the two short-time stationary signals are speech signals. In this embodiment, the sound source signal collected by a microphone comprises the speech signal of the sound source and an ambient noise signal; if the sound source is silent, the signal collected by the microphone is ambient noise only. Specifically, when both the short-time energy (the energy of the sound signal over a short period) and the short-time zero-crossing rate (the number of times per unit time that the signal waveform crosses the horizontal axis, that is, zero level) of the two short-time stationary signals are greater than their respective thresholds, the current sound source signal can be judged to be a speech signal.
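The endpoint detection rule described above can be sketched in Python as follows. The two-threshold rule (energy and zero-crossing rate both above threshold) follows the text; the threshold values, the normalisation of the zero-crossing rate by the number of sample pairs, and the function names are assumptions of this sketch and would need tuning for a real recording setup.

```python
def short_time_energy(frame):
    """Sum of squared samples in the frame."""
    return sum(v * v for v in frame)


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def is_speech(frame, energy_threshold, zcr_threshold):
    """A frame counts as speech only when both measures exceed their thresholds."""
    return (short_time_energy(frame) > energy_threshold
            and zero_crossing_rate(frame) > zcr_threshold)


loud = [1.0, -1.0, 1.0, -1.0]
quiet = [0.01, -0.01, 0.01, -0.01]
loud_is_speech = is_speech(loud, energy_threshold=1.0, zcr_threshold=0.5)
quiet_is_speech = is_speech(quiet, energy_threshold=1.0, zcr_threshold=0.5)
```

In a two-channel setting the same test would be applied to both short-time stationary signals, and the frame pair would be kept only if both pass.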
When frame λ of the framed signal is determined to be a non-speech signal, the current signal-to-noise ratio is
SNR(λ) = SNR(λ-1) (8)
When frame λ of the framed signal is determined to be a speech signal, the current signal-to-noise ratio is
SNR(λ) = a·SNR(λ-1) + (1-a)·SNR_0 (9)
Here SNR(λ-1) is the signal-to-noise ratio of the previous frame, SNR_0 is the energy ratio of the current speech frame to the last non-speech frame, and a is a smoothing factor.
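The recursive update of formulas (8) and (9) can be sketched in Python as below. Expressing SNR_0 in decibels (so that the result can be compared with the dB thresholds used for ρ), the default smoothing factor of 0.9, and the name `update_snr` are assumptions of this sketch; the source only states that SNR_0 is obtained from the energy ratio of the current speech frame to the last non-speech frame.

```python
import math


def update_snr(prev_snr_db, speech, frame_energy=None, noise_energy=None, a=0.9):
    """Return SNR(lam) given SNR(lam - 1).

    Non-speech frame: formula (8), the previous value is carried over.
    Speech frame: formula (9), smoothing between the previous value and
    the a priori estimate SNR_0 (here converted to dB, an assumption).
    """
    if not speech:
        return prev_snr_db
    snr0_db = 10.0 * math.log10(frame_energy / noise_energy)
    return a * prev_snr_db + (1.0 - a) * snr0_db


snr = 10.0
snr = update_snr(snr, speech=False)                      # unchanged
snr = update_snr(snr, speech=True, frame_energy=100.0,
                 noise_energy=1.0, a=0.5)                # moves toward 20 dB
```

A smoothing factor close to 1 makes the estimate change slowly from frame to frame, while a smaller value lets it track the latest frame more closely.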
When the current sound source signal is determined to be a speech signal, the fast Fourier transform is applied to the two short-time stationary signals, and their cross-spectral density function is then determined. Specifically, applying the fast Fourier transform to the two speech signals in formula (3) and formula (4) gives
$S_1(\lambda, k) = \sum_{n=0}^{N-1} s_1(\lambda, n) \exp\left(-j \frac{2\pi}{N} nk\right)$ (5)
$S_2(\lambda, k) = \sum_{n=0}^{N-1} s_2(\lambda, n) \exp\left(-j \frac{2\pi}{N} nk\right)$ (6)
The cross-spectral density function of the two speech signals can therefore be obtained as:
$R_{12}(\lambda, k) = S_1(\lambda, k)\, S_2^{*}(\lambda, k)$ (7)
Here $s_1(\lambda, n)$ and $s_2(\lambda, n)$ are finite-length sequences of length $N$, whose Fourier transforms are $S_1(\lambda, k)$ and $S_2(\lambda, k)$, and $S_2^{*}(\lambda, k)$ is the complex conjugate of $S_2(\lambda, k)$.
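Formulas (5) to (7) can be checked with the small Python sketch below. For brevity it evaluates the DFT sum directly, which costs O(N²) operations; a fast Fourier transform computes the same values more efficiently. The names `dft` and `cross_spectrum` are hypothetical.

```python
import cmath


def dft(frame):
    """S(k) = sum over n of s(n) * exp(-j * 2*pi/N * n * k), as in (5) and (6)."""
    N = len(frame)
    return [sum(frame[n] * cmath.exp(-2j * cmath.pi * n * k / N)
                for n in range(N))
            for k in range(N)]


def cross_spectrum(s1, s2):
    """R12(k) = S1(k) * conj(S2(k)), as in formula (7)."""
    S1, S2 = dft(s1), dft(s2)
    return [a * b.conjugate() for a, b in zip(S1, S2)]


frame_a = [1.0, 2.0, 0.5, -1.0]
frame_b = [2.0, 0.5, -1.0, 1.0]   # frame_a circularly advanced by one sample
R12 = cross_spectrum(frame_a, frame_b)
```

When both inputs are the same frame, every value of the cross-spectrum is real and non-negative; a delay between the two frames shows up only in the phase of the values.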
When the current two short-time stationary signals are determined to be non-speech signals, they are discarded. Once the two short-time stationary signals have been detected as non-speech, there is no need for any further computation, so discarding them in step S207 reduces the amount of calculation to a certain extent.
Step S208: determine the weighting function, as given by formula (10) or formula (11), according to the current signal-to-noise ratio, where $\varphi_{12}(w)$ is the cross-spectral density function of the sound source signals, ρ is an adjustment factor proportional to the current signal-to-noise ratio SNR(λ), and the coherence function is defined in terms of $\varphi_1(w)$ and $\varphi_2(w)$, the autocorrelation functions of the two speech signals.
Step S208 above is a specific preferred implementation of step S103 in embodiment one: the signal-to-noise ratio is determined first, and the weighting function is then determined according to that signal-to-noise ratio.
After the current signal-to-noise ratio has been determined, the corresponding weighting function is determined. In step S208, if the additive noise of the actual environment is not considered, the weighting function of this embodiment is the one given by formula (10).
If additive noise is considered, the weighting function of this embodiment is the one given by formula (11).
Here $\varphi_{12}(w)$ is the cross-spectral density function of the speech signals, ρ is an adjustment factor proportional to the current signal-to-noise ratio SNR(λ), and the coherence function is defined in terms of $\varphi_1(w)$ and $\varphi_2(w)$, the autocorrelation functions of the two speech signals.
Embodiment three:
Fig. 3 shows the structure of the sound source localization device provided by the third embodiment of the present invention; for ease of explanation, only the parts relevant to this embodiment are shown.
The sound source localization device provided by this embodiment comprises:
a microphone array acquisition and preprocessing unit 301, configured to collect sound source signals with a microphone array and to preprocess the sound source signals collected by any two of the microphones;
a cross-spectral density determining unit 302, configured to determine the cross-spectral density function of the two preprocessed sound source signals;
a weighting function determining unit 303, configured to determine a weighting function that is adjusted as the current signal-to-noise ratio changes;
a time delay determining unit 304, configured to determine the value sequence of the cross-correlation function of the two sound source signals according to the cross-spectral density function and the weighting function, and to determine the time delay with which the sound source signal arrives at the two microphones according to the maximum of the cross-correlation function;
a sound source localization unit 305, configured to locate the sound source position according to the arrangement of the microphone array and the time delay with which the sound source signal arrives at the two microphones.
The functional units 301-305 provided by this embodiment implement steps S101-S105 of embodiment one, respectively. After the microphone array acquisition and preprocessing unit 301 collects the sound source signals and preprocesses two of them, the cross-spectral density determining unit 302 and the weighting function determining unit 303 determine the cross-spectral density function and the weighting function, respectively. The weighting function is adjusted as the current signal-to-noise ratio changes, so its value does not change sharply. The time delay determining unit 304 determines the time delay with which the sound source signal arrives at the two microphones according to the cross-spectral density function and the weighting function, and the sound source localization unit 305 can then locate the sound source position according to the arrangement of the microphone array and the time delay. In the sound source localization device provided by this embodiment, the weighting function determined by the weighting function determining unit 303 follows changes in the current signal-to-noise ratio, which makes the resulting time delay more accurate and thereby improves sound source localization accuracy.
Embodiment four:
Fig. 4 shows the structure of the sound source localization device provided by the fourth embodiment of the present invention; for ease of explanation, only the parts relevant to this embodiment are shown.
The sound source localization device provided by this embodiment comprises:
a microphone array acquisition and preprocessing unit 401, configured to collect sound source signals with a microphone array and to preprocess the sound source signals collected by any two of the microphones;
a cross-spectral density determining unit 402, configured to determine the cross-spectral density function of the two preprocessed sound source signals;
a weighting function determining unit 403, configured to determine a weighting function that is adjusted as the current signal-to-noise ratio changes;
a time delay determining unit 404, configured to determine the value sequence of the cross-correlation function of the two sound source signals according to the cross-spectral density function and the weighting function, and to determine the time delay with which the sound source signal arrives at the two microphones according to the maximum of the cross-correlation function;
a sound source localization unit 405, configured to locate the sound source position according to the arrangement of the microphone array and the time delay with which the sound source signal arrives at the two microphones.
The microphone array acquisition and preprocessing unit 401 comprises:
a microphone array acquisition module 4011, configured to collect sound source signals with the microphone array;
a band-pass filtering module 4012, configured to apply band-pass filtering to the sound source signals collected by any two microphones in the microphone array to obtain two band-pass-filtered sound source signals;
a framing module 4013, configured to apply windowing and framing to the two band-pass-filtered sound source signals to obtain two short-time stationary signals.
The cross-spectral density determining unit 402 comprises:
a speech judgment module 4021, configured to determine by endpoint detection whether the preprocessed current frame of the sound source signal is a speech signal;
a current signal-to-noise ratio determination module 4022, configured to determine the current signal-to-noise ratio when the judgment is yes, the current signal-to-noise ratio being SNR(λ) = a·SNR(λ-1) + (1-a)·SNR_0, where SNR(λ-1) is the signal-to-noise ratio of the previous frame of the sound source signal, SNR_0 is the a priori signal-to-noise ratio obtained from the energy ratio of the current speech frame to the last non-speech frame, and a is a smoothing factor;
a cross-spectral density determination module 4023, configured to apply the fast Fourier transform to the two short-time stationary signals and then determine the cross-spectral density function of the two short-time stationary signals;
a signal discarding module 4024, configured to discard the two short-time stationary signals when the judgment is no, and to update the signal-to-noise ratio as SNR(λ) = SNR(λ-1), where SNR(λ-1) is the signal-to-noise ratio of the previous frame of the sound source signal.
The weighting function determining unit 403 comprises:
a weighting function determination module 4031, configured to determine the weighting function, as given by formula (10) or formula (11), according to the current signal-to-noise ratio, where $\varphi_{12}(w)$ is the cross-spectral density function of the sound source signals, ρ is an adjustment factor proportional to the current signal-to-noise ratio SNR(λ), and the coherence function is defined in terms of $\varphi_1(w)$ and $\varphi_2(w)$, the autocorrelation functions of the two sound source signals.
The time delay determining unit 404 comprises:
a cross-correlation function acquisition module 4041, configured to apply the inverse Fourier transform to the product of the cross-spectral density function and the weighting function to obtain the value sequence of the cross-correlation function of the two sound source signals;
a time delay determination module 4042, configured to perform peak detection on the value sequence of the cross-correlation function to obtain the sample point corresponding to the maximum, and to determine the time delay with which the sound source signal arrives at the two microphones according to the sampling interval.
Building on Embodiment 3, this embodiment gives a preferred concrete structure for each functional unit, corresponding to the steps of Embodiment 2. Specifically, after the microphone array acquisition module 4011 collects the sound source signals, the band-pass filtering module 4012 and the framing module 4013 preprocess the signals of any two channels, yielding two short-time stationary signals. When the speech decision module 4021 detects that the current sound source signal is a speech signal, the current signal-to-noise ratio determination module 4022 determines the current signal-to-noise ratio, and the cross-spectral density determination module 4023 applies a Fourier transform to the two short-time stationary signals and then determines their cross-spectral density function. Otherwise, the signal discarding module 4024 discards the two short-time stationary signals and updates the signal-to-noise ratio, which avoids unnecessary computation.
Once the current signal-to-noise ratio is determined, the weighting function determination module 4031 determines the weighting function from it. In one implementation, the weighting function is [formula missing in source] or [formula missing in source], where $\varphi_{12}(w)$ is the cross-spectral density function of the sound source signals, $\rho$ is an adjustment factor proportional to the current signal-to-noise ratio SNR(λ−1), [formula missing in source] is the coherence function, and $\varphi_1(w)$ and $\varphi_2(w)$ are the autocorrelation functions of the two sound source signals. The value of $\rho$ is obtained from repeated experimental tests in the sound source environment (see Embodiment 2 for values) and depends on the current signal-to-noise ratio SNR(λ): the higher SNR(λ) is, the larger $\rho$ is, and as SNR(λ) falls, $\rho$ falls with it. As a result, the value of the weighting function in this embodiment does not change abruptly.
The cross-correlation function acquisition module 4041 applies an inverse Fourier transform to the product of the cross-spectral density function and the weighting function to obtain the cross-correlation function of the two sound source signals. The time delay determination module 4042 performs peak detection on the value sequence of the cross-correlation function to find the sample point of the maximum, and multiplies that sample point by the sampling interval to obtain the time delay of the sound source signal's arrival at the two microphones. Once the time delay is determined, the sound source localization unit 405 combines it with the arrangement of the microphone array to locate the sound source accurately.
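The delay-estimation path described above (cross-spectral density, weighting, inverse Fourier transform, peak detection) can be illustrated with a short sketch. It is a minimal example under stated assumptions, not the embodiment's implementation: the weighting-function expressions are not reproduced in this text, so the code assumes a PHAT-style weighting $1/|\varphi_{12}|^{\rho}$, and the names `gcc_delay`, `rho`, and `fs` are hypothetical.

```python
import numpy as np

def gcc_delay(s1, s2, fs, rho=1.0):
    """Estimate the delay of s1 relative to s2, in seconds, by weighted GCC.

    Assumed weighting: 1 / |phi12|**rho (PHAT-style). In the scheme described
    above rho would follow the current SNR; here it is simply a parameter.
    """
    n = len(s1) + len(s2)                   # zero-pad to avoid circular wrap-around
    S1 = np.fft.rfft(s1, n)
    S2 = np.fft.rfft(s2, n)
    phi12 = S1 * np.conj(S2)                # cross-spectral density
    weight = 1.0 / np.maximum(np.abs(phi12), 1e-12) ** rho
    r12 = np.fft.irfft(phi12 * weight, n)   # cross-correlation value sequence
    max_lag = n // 2
    # Reorder so that index 0 corresponds to lag -max_lag.
    r12 = np.concatenate((r12[-max_lag:], r12[:max_lag + 1]))
    lag = int(np.argmax(r12)) - max_lag     # peak position in samples
    return lag / fs                         # sample point times sampling interval
```

With this sign convention, a positive result means the sound reached the second microphone before the first. Setting `rho` below 1 weakens the spectral whitening, which is one plausible way to make the weighting less aggressive at low signal-to-noise ratios.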
This embodiment implements each step of Embodiment 2 and provides concrete ways of determining the cross-spectral density function and the weighting function, enabling accurate sound source localization.
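The frame gating and signal-to-noise tracking that feed the weighting function can be sketched as follows. This is an illustrative sketch only: the recursion SNR(λ) = a·SNR(λ−1) + (1−a)·SNR_0 and the energy/zero-crossing test come from the claims, while expressing SNR_0 in decibels, the default smoothing factor `a=0.9`, the thresholds, and all function names are assumptions.

```python
import math

def short_time_energy(frame):
    """Sum of squared samples in one frame."""
    return sum(x * x for x in frame)

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / max(len(frame) - 1, 1)

def is_speech(frame, energy_thr, zcr_thr):
    """Speech decision: both measures must exceed their (assumed) thresholds."""
    return (short_time_energy(frame) > energy_thr
            and zero_crossing_rate(frame) > zcr_thr)

def update_snr(prev_snr, frame, noise_energy, speech, a=0.9):
    """Return SNR(lambda) given SNR(lambda-1).

    noise_energy is the energy of the last non-speech frame. For a
    non-speech frame the previous value is carried over unchanged.
    """
    if not speech:
        return prev_snr
    ratio = max(short_time_energy(frame), 1e-12) / max(noise_energy, 1e-12)
    snr0 = 10.0 * math.log10(ratio)         # a priori SNR in dB (assumed unit)
    return a * prev_snr + (1.0 - a) * snr0
```

A caller would run `is_speech` on each frame, pass the result to `update_snr`, and only compute the cross-spectral density for frames judged to be speech.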
In summary, compared with existing sound source localization techniques, the sound source localization method and device provided by the embodiments of the present invention can improve localization accuracy.
A person of ordinary skill in the art will understand that all or part of the steps of the methods in the above embodiments can be carried out by hardware under the instruction of a program, and that the program can be stored in a computer-readable storage medium such as a ROM/RAM, a magnetic disk, or an optical disc.
The foregoing describes only preferred embodiments of the present invention and is not intended to limit it. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.

Claims (7)

1. A sound source localization method, characterized in that the method comprises:
Collecting sound source signals with a microphone array, and preprocessing the sound source signals collected by any two of its microphones;
Determining the cross-spectral density function of the two preprocessed sound source signals;
Determining a weighting function that is adjusted as the current signal-to-noise ratio changes;
Determining the value sequence of the cross-correlation function of the two sound source signals from the cross-spectral density function and the weighting function, and determining, from the maximum of the cross-correlation function, the time delay of the sound source signal's arrival at the two microphones, wherein the value sequence of the cross-correlation function of the two sound source signals is obtained by applying an inverse Fourier transform to the product of the cross-spectral density function and the weighting function;
Locating the sound source position according to the arrangement of the microphone array and the time delay of the sound source signal's arrival at the two microphones;
The step of collecting sound source signals with the microphone array and preprocessing the sound source signals collected by any two of its microphones specifically comprises:
Collecting sound source signals with the microphone array;
Band-pass filtering the sound source signals collected by any two microphones in the microphone array to obtain two band-pass-filtered sound source signals;
Applying windowing and framing to the two band-pass-filtered sound source signals to obtain two short-time stationary signals;
The step of determining the cross-spectral density function of the two preprocessed sound source signals specifically comprises:
Judging by endpoint detection whether the two short-time stationary signals are speech signals;
When the judgment is affirmative, determining the current signal-to-noise ratio as $\mathrm{SNR}(\lambda) = a\,\mathrm{SNR}(\lambda-1) + (1-a)\,\mathrm{SNR}_0$, where $\mathrm{SNR}(\lambda-1)$ is the signal-to-noise ratio of the previous frame of the sound source signal, $\mathrm{SNR}_0$ is the a priori signal-to-noise ratio obtained from the energy ratio of the current speech frame to the last non-speech frame, and $a$ is a smoothing factor;
Performing a fast Fourier transform on the two short-time stationary signals and then determining their cross-spectral density function, wherein the cross-spectral density function of the two speech signals can be obtained as $\varphi_{12}(\lambda,k) = S_1(\lambda,k)\,S_2^{*}(\lambda,k)$, where $s_1(\lambda,n)$ and $s_2(\lambda,n)$ are finite-length sequences of length $N$, $S_1(\lambda,k)$ and $S_2(\lambda,k)$ are their Fourier transforms, and $S_2^{*}(\lambda,k)$ is the complex conjugate of $S_2(\lambda,k)$;
When the judgment is negative, discarding the two short-time stationary signals and updating the signal-to-noise ratio as $\mathrm{SNR}(\lambda) = \mathrm{SNR}(\lambda-1)$, where $\mathrm{SNR}(\lambda-1)$ is the signal-to-noise ratio of the previous frame of the sound source signal.
2. The method of claim 1, characterized in that the current sound source signal is judged to be a speech signal when both its short-time energy and its short-time zero-crossing rate exceed their respective thresholds.
3. The method of claim 2, characterized in that the step of determining a weighting function that is adjusted as the current signal-to-noise ratio changes specifically comprises:
Determining, according to the current signal-to-noise ratio, the weighting function as [formula missing in source] or [formula missing in source], where $\varphi_{12}(w)$ is the cross-spectral density function of the sound source signals, $\rho$ is an adjustment factor proportional to the current signal-to-noise ratio SNR(λ), [formula missing in source] is the coherence function, and $\varphi_1(w)$ and $\varphi_2(w)$ are the autocorrelation functions of the two sound source signals.
4. The method of claim 3, characterized in that the step of determining the value sequence of the cross-correlation function of the two sound source signals from the cross-spectral density function and the weighting function, and determining, from the maximum of the cross-correlation function, the time delay of the sound source signal's arrival at the two microphones, specifically comprises:
Applying an inverse Fourier transform to the product of the cross-spectral density function and the weighting function to obtain the value sequence of the cross-correlation function of the two sound source signals;
Performing peak detection on the value sequence of the cross-correlation function to obtain the sample point of the maximum, and determining the time delay of the sound source signal's arrival at the two microphones from that sample point and the sampling interval.
5. A sound source localization device, characterized in that the device comprises:
A microphone array acquisition and preprocessing unit, configured to collect sound source signals with a microphone array and to preprocess the sound source signals collected by any two of its microphones;
A cross-spectral density determination unit, configured to determine the cross-spectral density function of the two preprocessed sound source signals;
A weighting function determination unit, configured to determine a weighting function that is adjusted as the current signal-to-noise ratio changes;
A time delay determination unit, configured to determine the value sequence of the cross-correlation function of the two sound source signals from the cross-spectral density function and the weighting function, and to determine, from the maximum of the cross-correlation function, the time delay of the sound source signal's arrival at the two microphones, wherein the value sequence of the cross-correlation function of the two sound source signals is obtained by applying an inverse Fourier transform to the product of the cross-spectral density function and the weighting function;
A sound source localization unit, configured to locate the sound source position according to the arrangement of the microphone array and the time delay of the sound source signal's arrival at the two microphones;
The microphone array acquisition and preprocessing unit comprises:
A microphone array acquisition module, configured to collect sound source signals with the microphone array;
A band-pass filtering module, configured to band-pass filter the sound source signals collected by any two microphones in the microphone array to obtain two band-pass-filtered sound source signals;
A framing module, configured to apply windowing and framing to the two band-pass-filtered sound source signals to obtain two short-time stationary signals;
The cross-spectral density determination unit comprises:
A speech decision module, configured to judge by endpoint detection whether the preprocessed current frame of the sound source signal is a speech signal;
A current signal-to-noise ratio determination module, configured to determine, when the judgment is affirmative, the current signal-to-noise ratio as $\mathrm{SNR}(\lambda) = a\,\mathrm{SNR}(\lambda-1) + (1-a)\,\mathrm{SNR}_0$, where $\mathrm{SNR}(\lambda-1)$ is the signal-to-noise ratio of the previous frame of the sound source signal, $\mathrm{SNR}_0$ is the a priori signal-to-noise ratio obtained from the energy ratio of the current speech frame to the last non-speech frame, and $a$ is a smoothing factor;
A cross-spectral density determination module, configured to perform a fast Fourier transform on the two short-time stationary signals and then determine their cross-spectral density function, wherein the cross-spectral density function of the two speech signals can be obtained as $\varphi_{12}(\lambda,k) = S_1(\lambda,k)\,S_2^{*}(\lambda,k)$, where $s_1(\lambda,n)$ and $s_2(\lambda,n)$ are finite-length sequences of length $N$, $S_1(\lambda,k)$ and $S_2(\lambda,k)$ are their Fourier transforms, and $S_2^{*}(\lambda,k)$ is the complex conjugate of $S_2(\lambda,k)$;
A signal discarding module, configured to discard, when the judgment is negative, the two short-time stationary signals and to update the signal-to-noise ratio as $\mathrm{SNR}(\lambda) = \mathrm{SNR}(\lambda-1)$, where $\mathrm{SNR}(\lambda-1)$ is the signal-to-noise ratio of the previous frame of the sound source signal.
6. The device of claim 5, characterized in that the weighting function determination unit comprises:
A weighting function determination module, configured to determine, according to the current signal-to-noise ratio, the weighting function as [formula missing in source] or [formula missing in source], where $\varphi_{12}(w)$ is the cross-spectral density function of the sound source signals, $\rho$ is an adjustment factor proportional to the current signal-to-noise ratio SNR(λ−1), [formula missing in source] is the coherence function, and $\varphi_1(w)$ and $\varphi_2(w)$ are the autocorrelation functions of the two speech signals.
7. The device of claim 6, characterized in that the time delay determination unit comprises:
A cross-correlation function acquisition module, configured to apply an inverse Fourier transform to the product of the cross-spectral density function and the weighting function to obtain the value sequence of the cross-correlation function of the two sound source signals;
A time delay determination module, configured to perform peak detection on the cross-correlation function to obtain the sample point of the maximum, and to determine the time delay of the sound source signal's arrival at the two microphones from that sample point and the sampling interval.
CN201210281019.9A 2012-08-08 2012-08-08 A kind of sound localization method and device Expired - Fee Related CN102854494B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201210281019.9A CN102854494B (en) 2012-08-08 2012-08-08 A kind of sound localization method and device

Publications (2)

Publication Number Publication Date
CN102854494A CN102854494A (en) 2013-01-02
CN102854494B true CN102854494B (en) 2015-09-09

Family

ID=47401242

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201210281019.9A Expired - Fee Related CN102854494B (en) 2012-08-08 2012-08-08 A kind of sound localization method and device

Country Status (1)

Country Link
CN (1) CN102854494B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101762806A (en) * 2010-01-27 2010-06-30 华为终端有限公司 Sound source locating method and apparatus thereof
CN102411138A (en) * 2011-07-13 2012-04-11 北京大学 Method for positioning sound source by robot
CN102438189A (en) * 2011-08-30 2012-05-02 东南大学 Dual-channel acoustic signal-based sound source localization method

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
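A modified time-delay estimation method for near-field sound source localization; 杜要峰 et al.; 《电声基础》; 20101231; Vol. 34 (No. 2); pp. 47-50, 81 *
A microphone-array-based three-dimensional sound source localization algorithm and its implementation; 杨祥清 et al.; 《声学技术》; 20080430; Vol. 27 (No. 2); pp. 260-265 *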

Also Published As

Publication number Publication date
CN102854494A (en) 2013-01-02

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20150909
