CN110007276A - Sound source localization method and system
- Publication number: CN110007276A
- Application number: CN201910312565.6A
- Authority: CN (China)
- Legal status: Granted (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01S—RADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
- G01S5/00—Position-fixing by co-ordinating two or more direction or position line determinations; Position-fixing by co-ordinating two or more distance determinations
- G01S5/18—Position-fixing by co-ordinating two or more direction or position line determinations; Position-fixing by co-ordinating two or more distance determinations using ultrasonic, sonic, or infrasonic waves
- G01S5/20—Position of source determined by a plurality of spaced direction-finders
Landscapes
- Physics & Mathematics (AREA)
- Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Radar, Positioning & Navigation (AREA)
- Remote Sensing (AREA)
- Circuit For Audible Band Transducer (AREA)
- Measurement Of Velocity Or Position Using Acoustic Or Ultrasonic Waves (AREA)
Abstract
The invention discloses a sound source localization method and system. The method first windows and frames the sound source speech signals acquired by a four-element microphone array, then detects the valid frame signals, and computes the quadratic-correlation-fused generalized spectral-subtraction modified phase transform function for the valid frame signals that pass screening. To further improve delay accuracy, the delay values are computed from the quadratic-correlation-fused averaged generalized spectral-subtraction modified phase transform function. Finally, the sound source direction is estimated from the geometry of the microphone array and the computed delay values, which improves the accuracy of sound source localization.
Description
Technical field
The present invention relates to the field of sound source localization, and in particular to a sound source localization method and system.
Background
Sound source localization has become a research hotspot in speech signal processing and is widely used in video conferencing, intelligent robots, intelligent video surveillance systems, and similar fields. However, under adverse conditions of low signal-to-noise ratio and long reverberation time, the accuracy of traditional localization algorithms drops sharply.
Summary of the invention
The object of the present invention is to provide a sound source localization method and system that improve the accuracy of sound source localization.
To achieve the above object, the present invention provides the following schemes:
The present invention provides a sound source localization method comprising the following steps:
Acquiring four channels of sound source speech signals with a four-element microphone array; the four-element microphone array comprises four microphones, each of which acquires one channel of the sound source speech signal;
Performing synchronized framing on the four channels of sound source speech signals to obtain a frame signal set, where each frame signal in the set comprises four channel frame signals: the first, second, third, and fourth channel frame signals;
Judging the validity of each frame signal in the frame signal set to obtain a valid frame signal subset;
Obtaining, from the valid frame signal subset, the quadratic-correlation-fused averaged generalized spectral-subtraction modified phase transform function for any two channels of valid frame signals;
Taking the time point corresponding to the maximum peak of that function as the delay value between the sound source signals of the corresponding two microphones;
Determining the direction of the sound source from the geometry of the four-element microphone array and the delay values between the sound source signals of any two microphones.
Optionally, performing synchronized framing on the four channels of sound source speech signals to obtain the frame signal set specifically comprises:
Applying a window function to the four channels of sound source speech signals to perform synchronized windowing and framing, obtaining frame signals x_ij(n), where n denotes the n-th sampling point, n = 1, 2, ..., N, N denotes the frame length, and x_ij(n) denotes the j-th channel of the i-th frame signal, j = 1, 2, 3, 4;
Combining all frame signals into the frame signal set.
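The synchronized windowing and framing step can be sketched as follows. This is a minimal illustration, not the patent's implementation: the window formula is not reproduced in this text, so a Hamming window is assumed, and the names `hamming`, `frame_channels`, and the hop size `hop` are hypothetical.

```python
import math

def hamming(N):
    """Hamming window of length N (an assumed choice; the source does not reproduce its window formula)."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * n / (N - 1)) for n in range(N)]

def frame_channels(channels, N, hop):
    """Split the channels into synchronized, windowed frames.

    Returns frames[i][j][n]: frame i, channel j, sample n, mirroring x_ij(n).
    The same start index is used for every channel so the frames stay aligned.
    """
    length = min(len(ch) for ch in channels)
    w = hamming(N)
    frames = []
    for start in range(0, length - N + 1, hop):
        frames.append([[ch[start + n] * w[n] for n in range(N)] for ch in channels])
    return frames

# Four dummy channels of 16 samples each, frame length 8, hop 4.
channels = [[float(k + j) for k in range(16)] for j in range(4)]
frames = frame_channels(channels, N=8, hop=4)
print(len(frames), len(frames[0]), len(frames[0][0]))
```

Using one shared start index per frame is what keeps the four channels synchronized, which matters later because the delay estimate compares the same time span across microphones.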
Optionally, judging the validity of each frame signal in the frame signal set to obtain the valid frame signal subset specifically comprises:
Calculating the short-time frame energy E_ij of the j-th channel of the i-th frame signal using the short-time energy formula, where n denotes the n-th sampling point, n = 1, 2, ..., N, and N denotes the frame length;
Judging whether the short-time frame energy of the j-th channel of the i-th frame signal is greater than a first preset threshold, obtaining a first judgment result;
If the first judgment result indicates that the short-time frame energy is not greater than the first preset threshold, increasing i by 1 and returning to the step of calculating the short-time frame energy of the j-th channel of the i-th frame signal;
If the first judgment result indicates that the short-time frame energy is greater than the first preset threshold, setting the i-th frame signal as the starting point and increasing i by 1;
Calculating the zero-crossing rate of the j-th channel of the i-th frame signal using the zero-crossing-rate formula;
Judging whether the zero-crossing rate is greater than a second preset threshold, obtaining a second judgment result;
If the second judgment result indicates that the zero-crossing rate is greater than the second preset threshold, setting the label T_ij of the j-th channel of the i-th frame signal to 1;
If the second judgment result indicates that the zero-crossing rate is not greater than the second preset threshold, setting the label T_ij of the j-th channel of the i-th frame signal to 0;
Calculating the total state value SS(i) of the labels of the four channels of the i-th frame signal using the formula SS(i) = T_i1 && T_i2 && T_i3 && T_i4, where T_i1, T_i2, T_i3, and T_i4 denote the labels of the 1st, 2nd, 3rd, and 4th channels of the i-th frame signal;
Judging whether the total state value SS(i) equals 1, obtaining a third judgment result;
If the third judgment result indicates that SS(i) equals 1, setting the i-th frame signal as a valid frame signal;
Judging whether the short-time frame energy of the j-th channel of the i-th frame signal is less than a third preset threshold, obtaining a fourth judgment result;
If the fourth judgment result indicates that the short-time frame energy of the j-th channel of the i-th frame signal is less than the third preset threshold, setting the i-th frame signal as the end point of the speech signal and obtaining the valid frame signal subset;
If the fourth judgment result indicates that the short-time frame energy of the j-th channel of the i-th frame signal is not less than the third preset threshold, increasing i by 1 and returning to the step of calculating the zero-crossing rate of the j-th channel of the i-th frame signal.
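The validity check above can be sketched as a small state machine. The energy and zero-crossing formulas are not reproduced in this text, so the sketch assumes the textbook definitions (sum of squared samples, and half the sum of absolute sign differences). The text also does not say which channel j drives the energy tests, so a reference channel `ref` is assumed; all function and parameter names are hypothetical.

```python
def short_time_energy(x):
    """Assumed definition: sum of squared samples over the frame."""
    return sum(v * v for v in x)

def sign(v):
    return 1 if v >= 0 else -1

def zero_crossing_rate(x):
    """Assumed definition: half the sum of absolute sign changes between neighbours."""
    return 0.5 * sum(abs(sign(x[n]) - sign(x[n - 1])) for n in range(1, len(x)))

def select_valid_frames(frames, e_start, zcr_min, e_end, ref=0):
    """Return indices of valid frames; frames[i][j] is channel j of frame i."""
    i = 0
    # Skip frames until the energy exceeds the first threshold (starting point).
    while i < len(frames) and short_time_energy(frames[i][ref]) <= e_start:
        i += 1
    i += 1  # the text advances to the frame after the starting point
    valid = []
    while i < len(frames):
        labels = [zero_crossing_rate(ch) > zcr_min for ch in frames[i]]  # T_ij
        if all(labels):  # SS(i) = T_i1 && T_i2 && T_i3 && T_i4
            valid.append(i)
        if short_time_energy(frames[i][ref]) < e_end:  # end point reached
            break
        i += 1
    return valid

silent = [[0.0] * 8 for _ in range(4)]
active = [[1.0 if n % 2 == 0 else -1.0 for n in range(8)] for _ in range(4)]
demo = [silent, active, active, active, silent]
print(select_valid_frames(demo, e_start=1.0, zcr_min=3.0, e_end=0.5))
```

Requiring all four labels to be 1 means a frame is kept only when every microphone sees activity, which avoids estimating delays from a frame where one channel holds only noise.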
Optionally, obtaining, from the valid frame signal subset, the quadratic-correlation-fused averaged generalized spectral-subtraction modified phase transform function for any two channels of valid frame signals specifically comprises:
Calculating, from the valid frame signal subset, the quadratic correlation of any two channels of each valid frame signal;
Calculating, from the valid frame signal subset, the power spectrum of each channel of each valid frame signal;
Obtaining, from the power spectrum of each channel, the noise masking function z_pq(ω) of each channel of each valid frame signal, where z_pq(ω) denotes the noise masking function of the q-th channel of the p-th valid frame signal, X_pq(ω) denotes the power spectrum of the q-th channel of the p-th valid frame signal, q = 1, 2, 3, 4, N(ω) denotes the noise power spectrum, α denotes a first coefficient, and β denotes a second coefficient;
Obtaining, from the noise masking function of each channel of each valid frame signal and the quadratic correlation of any two channels, the quadratic-correlation-fused generalized spectral-subtraction modified phase transform function φ_ls_p(ω) for any two channels of each valid frame signal, where φ_ls_p(ω) denotes that function for the l-th and s-th channels of the p-th valid frame signal, l = 1, 2, 3, 4, s = 1, 2, 3, 4, l ≠ s, X_pl(ω) and X_ps(ω) denote the power spectra of the l-th and s-th channels of the p-th valid frame signal, and ρ denotes a third coefficient;
Obtaining, from the per-frame functions φ_ls_p(ω), the quadratic-correlation-fused averaged generalized spectral-subtraction modified phase transform function for the l-th and s-th channels of the valid frame signals by averaging over the valid frames, where P denotes the number of valid frame signals in the valid frame signal subset.
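The formulas for the masking function and the fused phase transform are not reproduced in this text, so the following is only one plausible reading, not the patent's exact weighting. It sketches a GCC-PHAT-style delay estimator in which (a) the cross-spectrum is that of the quadratic correlation, |X_l|^2 · conj(X_l) · X_s, (b) each channel contributes an assumed spectral-subtraction gain built from α and β, (c) the PHAT normalisation is softened by the exponent ρ, and (d) the weighted spectra are averaged over the P valid frames before the peak search. All function names are hypothetical, and a naive DFT keeps the sketch free of external dependencies.

```python
import cmath
import math
import random

def dft(x):
    """Naive O(N^2) discrete Fourier transform; adequate for short demo frames."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * math.pi * k * n / N) for n in range(N))
            for k in range(N)]

def idft_real(X):
    """Real part of the inverse DFT."""
    N = len(X)
    return [sum(X[k] * cmath.exp(2j * math.pi * k * n / N) for k in range(N)).real / N
            for n in range(N)]

def noise_mask(power, noise, alpha, beta):
    """Assumed spectral-subtraction gain: share of power above alpha*noise, floored at beta."""
    return [max((p - alpha * n0) / p, beta) if p > 0.0 else beta
            for p, n0 in zip(power, noise)]

def weighted_spectrum(xl, xs, noise, alpha, beta, rho):
    """Masked, PHAT-style weighted spectrum of the quadratic correlation for one frame."""
    Xl, Xs = dft(xl), dft(xs)
    Pl = [abs(v) ** 2 for v in Xl]
    Ps = [abs(v) ** 2 for v in Xs]
    zl = noise_mask(Pl, noise, alpha, beta)
    zs = noise_mask(Ps, noise, alpha, beta)
    out = []
    for k in range(len(Xl)):
        # Spectrum of corr(autocorr(xl), crosscorr(xl, xs)) = |Xl|^2 * conj(Xl) * Xs.
        G = Pl[k] * Xl[k].conjugate() * Xs[k]
        out.append(zl[k] * zs[k] * G / (abs(G) ** rho + 1e-12))
    return out

def estimate_delay(frames_l, frames_s, noise, alpha=1.0, beta=0.1, rho=0.75):
    """Average the weighted spectra over the valid frames; return the peak lag in samples.

    A positive lag means channel s lags channel l.
    """
    N = len(frames_l[0])
    P = len(frames_l)
    avg = [0j] * N
    for xl, xs in zip(frames_l, frames_s):
        for k, v in enumerate(weighted_spectrum(xl, xs, noise, alpha, beta, rho)):
            avg[k] += v / P
    corr = idft_real(avg)
    lag = max(range(N), key=lambda n: corr[n])
    return lag - N if lag > N // 2 else lag

# Two synthetic valid frames; channel s is channel l circularly delayed by 3 samples.
rng = random.Random(0)
N, true_lag = 32, 3
frames_l = [[rng.uniform(-1.0, 1.0) for _ in range(N)] for _ in range(2)]
frames_s = [[x[(n - true_lag) % N] for n in range(N)] for x in frames_l]
lag = estimate_delay(frames_l, frames_s, noise=[0.0] * N)
print(lag)
```

Dividing the lag by the sampling rate gives the delay in seconds. Averaging in the frequency domain before the inverse transform lets consistent phase information accumulate across frames while frame-specific noise partially cancels.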
Optionally, determining the direction of the sound source from the geometry of the four-element microphone array and the delay values between the sound source signals of any two microphones specifically comprises:
Calculating the azimuth angle θ of the sound source relative to the coordinate origin from the geometry of the four-element microphone array and the delay values between the sound source signals of any two microphones, using the azimuth formula;
Calculating the pitch angle of the sound source relative to the coordinate origin from the geometry of the four-element microphone array and the delay values between the sound source signals of any two microphones, using the pitch-angle formula;
where c is the speed of sound, d is the distance from each microphone element to the coordinate origin, τ_12 denotes the delay value between the sound source signals of microphone 1 and microphone 2, τ_13 denotes the delay value between the sound source signals of microphone 1 and microphone 3, and τ_14 denotes the delay value between the sound source signals of microphone 1 and microphone 4.
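The azimuth and pitch formulas are not reproduced in this text. The sketch below derives one consistent pair under stated assumptions: a far-field (plane-wave) source, microphones at (d, 0, 0), (0, d, 0), (-d, 0, 0), and (0, -d, 0), and the sign convention τ_1k = t_k - t_1 (positive when microphone k hears the source later than microphone 1). The symbol φ for the pitch angle and the function name are hypothetical; the patent's own formulas may differ in form or sign.

```python
import math

def direction_from_delays(tau12, tau13, tau14, d, c=343.0):
    """Far-field azimuth theta and pitch phi (radians) from three delays in seconds.

    With unit direction u = (cos(phi)cos(theta), cos(phi)sin(theta), sin(phi)):
      tau13          = 2*d*u_x / c
      tau14 - tau12  = 2*d*u_y / c
    Only the upper hemisphere (phi >= 0) is resolved, because a planar
    array cannot tell sources above the plane from those below it.
    """
    ux = c * tau13 / (2.0 * d)            # cos(phi) * cos(theta)
    uy = c * (tau14 - tau12) / (2.0 * d)  # cos(phi) * sin(theta)
    theta = math.atan2(uy, ux)
    cos_phi = min(1.0, math.hypot(ux, uy))  # clamp against rounding and noisy delays
    phi = math.acos(cos_phi)
    return theta, phi

# Source on the +x axis in the array plane: microphone 1 hears it first.
d, c = 0.1, 343.0
theta, phi = direction_from_delays(tau12=d / c, tau13=2 * d / c, tau14=d / c, d=d, c=c)
print(math.degrees(theta), math.degrees(phi))
```

Using `atan2` rather than a plain arctangent of the ratio keeps the azimuth unambiguous over the full circle, including sources for which τ_13 is zero or negative.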
Optionally, before performing synchronized framing on the four channels of sound source speech signals to obtain the frame signal set, the method further comprises:
Performing speech enhancement on each channel of the sound source speech signal, obtaining speech-enhanced signals;
Band-pass filtering the speech-enhanced signals, obtaining band-pass-filtered signals;
Applying wavelet threshold denoising to the band-pass-filtered signals, obtaining the pre-processed sound source speech signals.
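The last pre-processing stage, wavelet threshold denoising, can be illustrated with a deliberately small sketch. The text does not name the wavelet, the decomposition depth, or the threshold rule, so a single-level Haar transform with soft thresholding is assumed; the function names and the threshold `t` are hypothetical, and the input length is assumed to be even.

```python
import math

def haar_forward(x):
    """One-level orthonormal Haar transform: returns (approximation, detail)."""
    s = math.sqrt(2.0)
    approx = [(x[2 * k] + x[2 * k + 1]) / s for k in range(len(x) // 2)]
    detail = [(x[2 * k] - x[2 * k + 1]) / s for k in range(len(x) // 2)]
    return approx, detail

def haar_inverse(approx, detail):
    """Inverse of haar_forward."""
    s = math.sqrt(2.0)
    x = []
    for a, dcoef in zip(approx, detail):
        x.extend([(a + dcoef) / s, (a - dcoef) / s])
    return x

def soft_threshold(v, t):
    """Shrink v toward zero by t; values with |v| <= t become zero."""
    return math.copysign(max(abs(v) - t, 0.0), v)

def wavelet_denoise(x, t):
    """Soft-threshold the detail coefficients and reconstruct the signal."""
    approx, detail = haar_forward(x)
    detail = [soft_threshold(v, t) for v in detail]
    return haar_inverse(approx, detail)

noisy = [1.0, 1.2, 3.0, 2.8]
print(wavelet_denoise(noisy, t=1.0))
```

With a threshold larger than every detail coefficient, each sample pair collapses to its mean, which shows the smoothing effect in its simplest form; a practical system would use several decomposition levels and a data-driven threshold.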
A sound source localization system, comprising:
A sound source speech signal acquisition module, for acquiring four channels of sound source speech signals with a four-element microphone array; the four-element microphone array comprises four microphones, each of which acquires one channel of the sound source speech signal;
A framing module, for performing synchronized framing on the four channels of sound source speech signals to obtain a frame signal set, where each frame signal in the set comprises four channel frame signals: the first, second, third, and fourth channel frame signals;
A valid frame signal subset acquisition module, for judging the validity of each frame signal in the frame signal set to obtain a valid frame signal subset;
A quadratic-correlation-fused averaged generalized spectral-subtraction modified phase transform function acquisition module, for obtaining that function for any two channels of valid frame signals from the valid frame signal subset;
A delay value calculation module, for taking the time point corresponding to the maximum peak of the quadratic-correlation-fused averaged generalized spectral-subtraction modified phase transform function of any two channels of valid frame signals as the delay value between the sound source signals of the corresponding two microphones;
A direction determination module, for determining the direction of the sound source from the geometry of the four-element microphone array and the delay values between the sound source signals of any two microphones.
Optionally, the framing module specifically comprises:
A framing submodule, for applying a window function to the four channels of sound source speech signals to perform synchronized windowing and framing, obtaining frame signals x_ij(n), where n denotes the n-th sampling point, n = 1, 2, ..., N, N denotes the frame length, and x_ij(n) denotes the j-th channel of the i-th frame signal, j = 1, 2, 3, 4;
A combining submodule, for combining all frame signals into the frame signal set.
Optionally, the valid frame signal subset acquisition module specifically comprises:
A short-time frame energy calculation submodule, for calculating the short-time frame energy E_ij of the j-th channel of the i-th frame signal using the short-time energy formula, where n denotes the n-th sampling point, n = 1, 2, ..., N, and N denotes the frame length;
A first judgment submodule, for judging whether the short-time frame energy of the j-th channel of the i-th frame signal is greater than a first preset threshold, obtaining a first judgment result;
A first judgment result processing submodule, for: if the first judgment result indicates that the short-time frame energy is not greater than the first preset threshold, increasing i by 1 and calling the short-time frame energy calculation submodule to calculate the short-time frame energy of the j-th channel of the i-th frame signal; if the first judgment result indicates that the short-time frame energy is greater than the first preset threshold, setting the i-th frame signal as the starting point and increasing i by 1;
A zero-crossing rate calculation submodule, for calculating the zero-crossing rate of the j-th channel of the i-th frame signal using the zero-crossing-rate formula;
A second judgment submodule, for judging whether the zero-crossing rate is greater than a second preset threshold, obtaining a second judgment result;
A second judgment result processing submodule, for: if the second judgment result indicates that the zero-crossing rate is greater than the second preset threshold, setting the label T_ij of the j-th channel of the i-th frame signal to 1; if the second judgment result indicates that the zero-crossing rate is not greater than the second preset threshold, setting the label T_ij of the j-th channel of the i-th frame signal to 0;
A total state value calculation submodule, for calculating the total state value SS(i) of the labels of the four channels of the i-th frame signal using the formula SS(i) = T_i1 && T_i2 && T_i3 && T_i4, where T_i1, T_i2, T_i3, and T_i4 denote the labels of the 1st, 2nd, 3rd, and 4th channels of the i-th frame signal;
A third judgment submodule, for judging whether the total state value SS(i) equals 1, obtaining a third judgment result;
A third judgment result processing submodule, for setting the i-th frame signal as a valid frame signal if the third judgment result indicates that SS(i) equals 1;
A fourth judgment submodule, for judging whether the short-time frame energy of the j-th channel of the i-th frame signal is less than a third preset threshold, obtaining a fourth judgment result;
A fourth judgment result processing submodule, for: if the fourth judgment result indicates that the short-time frame energy of the j-th channel of the i-th frame signal is less than the third preset threshold, setting the i-th frame signal as the end point of the speech signal and obtaining the valid frame signal subset; if the fourth judgment result indicates that the short-time frame energy of the j-th channel of the i-th frame signal is not less than the third preset threshold, increasing i by 1 and calling the zero-crossing rate calculation submodule to calculate the zero-crossing rate of the j-th channel of the i-th frame signal.
Optionally, the quadratic-correlation-fused averaged generalized spectral-subtraction modified phase transform function acquisition module specifically comprises:
A quadratic correlation calculation submodule, for calculating, from the valid frame signal subset, the quadratic correlation of any two channels of each valid frame signal;
A power spectrum calculation submodule, for calculating, from the valid frame signal subset, the power spectrum of each channel of each valid frame signal;
A noise masking function acquisition submodule, for obtaining, from the power spectrum of each channel, the noise masking function z_pq(ω) of each channel of each valid frame signal, where z_pq(ω) denotes the noise masking function of the q-th channel of the p-th valid frame signal, X_pq(ω) denotes the power spectrum of the q-th channel of the p-th valid frame signal, q = 1, 2, 3, 4, N(ω) denotes the noise power spectrum, α denotes a first coefficient, and β denotes a second coefficient;
A quadratic-correlation-fused generalized spectral-subtraction modified phase transform function acquisition submodule, for obtaining, from the noise masking function of each channel of each valid frame signal and the quadratic correlation of any two channels, the function φ_ls_p(ω) for any two channels of each valid frame signal, where φ_ls_p(ω) denotes that function for the l-th and s-th channels of the p-th valid frame signal, l = 1, 2, 3, 4, s = 1, 2, 3, 4, l ≠ s, X_pl(ω) and X_ps(ω) denote the power spectra of the l-th and s-th channels of the p-th valid frame signal, and ρ denotes a third coefficient;
A quadratic-correlation-fused averaged generalized spectral-subtraction modified phase transform function acquisition submodule, for obtaining, from the per-frame functions φ_ls_p(ω), the averaged function for the l-th and s-th channels of the valid frame signals by averaging over the valid frames, where P denotes the number of valid frame signals in the valid frame signal subset.
According to the specific embodiments provided by the present invention, the invention achieves the following technical effects:
The invention discloses a sound source localization method and system. The method first windows and frames the sound source speech signals acquired by a four-element microphone array, then detects the valid frame signals, and computes the quadratic-correlation-fused generalized spectral-subtraction modified phase transform function for the valid frame signals that pass screening. To further improve delay accuracy, the delay values are computed from the quadratic-correlation-fused averaged generalized spectral-subtraction modified phase transform function. Finally, the sound source direction is estimated from the geometry of the microphone array and the computed delay values, which improves the accuracy of sound source localization.
Brief description of the drawings
To explain the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings needed in the embodiments are briefly introduced below. The drawings described below are only some embodiments of the invention; those of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a flow chart of the sound source localization method provided by the invention;
Fig. 2 is a model diagram of the four-element microphone array provided by the invention;
Fig. 3 is a comparison chart of the per-frame delay estimation accuracy of different algorithms in a -5 dB noise environment;
Fig. 4 is a comparison chart of the per-frame delay estimation accuracy of different algorithms in a 5 dB noise environment;
Fig. 5 is a comparison chart of the per-frame delay estimation accuracy of different algorithms in an environment with 5 dB noise and a reverberation time of 750 ms;
Fig. 6 is a photograph of the acquisition card provided by the invention;
Fig. 7 is a photograph of the microphone provided by the invention;
Fig. 8 is a photograph of the four-element microphone array provided by the invention;
Fig. 9 is a structure diagram of the sound source localization system provided by the invention.
Detailed description of the embodiments
The object of the present invention is to provide a sound source localization method and system that improve the accuracy of sound source localization.
To make the above objects, features, and advantages of the present invention clearer and easier to understand, the invention is described in further detail below with reference to the drawings and specific embodiments.
Embodiment 1
Embodiment 1 of the present invention provides a sound source localization method.
As shown in Fig. 1, the sound source localization method comprises the following steps:
Step 101: acquire four channels of sound source speech signals with a four-element microphone array; the four-element microphone array comprises four microphones, each of which acquires one channel of the sound source speech signal;
Step 102: perform synchronized framing on the four channels of sound source speech signals to obtain a frame signal set, where each frame signal in the set comprises four channel frame signals: the first, second, third, and fourth channel frame signals;
Step 103: judge the validity of each frame signal in the frame signal set to obtain a valid frame signal subset;
Step 104: obtain, from the valid frame signal subset, the quadratic-correlation-fused averaged generalized spectral-subtraction modified phase transform function for any two channels of valid frame signals;
Step 105: take the time point corresponding to the maximum peak of that function as the delay value between the sound source signals of the corresponding two microphones;
Step 106: determine the direction of the sound source from the geometry of the four-element microphone array and the delay values between the sound source signals of any two microphones.
Embodiment 2
Embodiment 2 of the present invention provides a preferred implementation of the sound source localization method; implementation of the invention is not limited to the implementation defined in Embodiment 2.
The four-element microphone array of step 101 is shown in Fig. 2. The microphone coordinates are m1(d, 0, 0), m2(0, d, 0), m3(-d, 0, 0), and m4(0, -d, 0), where d is the distance from each microphone element to the origin.
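To make the geometry concrete, the following sketch computes the delays this array would observe for a distant source in a given direction. It assumes a far-field plane wave and the sign convention τ_1k = t_k - t_1; the function name, the pitch symbol `phi`, and the default speed of sound are hypothetical choices, not taken from the source.

```python
import math

# Unit positions of m1..m4; multiplied by d they give the coordinates listed above.
MIC_UNIT_POSITIONS = [(1, 0, 0), (0, 1, 0), (-1, 0, 0), (0, -1, 0)]

def expected_delays(theta, phi, d, c=343.0):
    """Far-field delays (seconds) of microphones 2..4 relative to microphone 1.

    theta is the azimuth and phi the pitch angle, both in radians.
    Returns a dict keyed by (1, k) holding tau_1k = t_k - t_1.
    """
    u = (math.cos(phi) * math.cos(theta),
         math.cos(phi) * math.sin(theta),
         math.sin(phi))
    # A plane wave from direction u reaches a microphone at position m at time -(m . u) / c.
    arrival = [-(d * (m[0] * u[0] + m[1] * u[1] + m[2] * u[2])) / c
               for m in MIC_UNIT_POSITIONS]
    return {(1, k + 1): arrival[k] - arrival[0] for k in range(1, 4)}

delays = expected_delays(theta=0.0, phi=0.0, d=0.1)
for pair in sorted(delays):
    print(pair, delays[pair])
```

For a source on the +x axis the delay between microphones 1 and 3 equals 2d/c, the largest delay the array can produce, which bounds the lag range worth searching when picking correlation peaks.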
After the four channels of sound source speech signals are acquired, speech enhancement is applied to each channel to obtain speech-enhanced signals; the speech-enhanced signals are band-pass filtered to obtain band-pass-filtered signals; and wavelet threshold denoising is applied to the band-pass-filtered signals to obtain the pre-processed sound source speech signals.
Performing synchronized framing on the four channels of sound source speech signals in step 102 to obtain the frame signal set specifically comprises: applying a window function to the four channels of sound source speech signals to perform synchronized windowing and framing, obtaining frame signals x_ij(n), where n denotes the n-th sampling point, n = 1, 2, ..., N, N denotes the frame length, and x_ij(n) denotes the j-th channel of the i-th frame signal, j = 1, 2, 3, 4; and combining all frame signals into the frame signal set.
Judging the validity of each frame signal in the frame signal set in step 103 to obtain the valid frame signal subset specifically comprises:
Calculating the short-time frame energy E_ij of the j-th channel of the i-th frame signal using the short-time energy formula, where n denotes the n-th sampling point, n = 1, 2, ..., N, and N denotes the frame length;
Judging whether the short-time frame energy of the j-th channel of the i-th frame signal is greater than a first preset threshold, obtaining a first judgment result;
If the first judgment result indicates that the short-time frame energy is not greater than the first preset threshold, increasing i by 1 and returning to the step of calculating the short-time frame energy of the j-th channel of the i-th frame signal;
If the first judgment result indicates that the short-time frame energy is greater than the first preset threshold, setting the i-th frame signal as the starting point and increasing i by 1;
Calculating the zero-crossing rate of the j-th channel of the i-th frame signal using the zero-crossing-rate formula;
Judging whether the zero-crossing rate is greater than a second preset threshold, obtaining a second judgment result;
If the second judgment result indicates that the zero-crossing rate is greater than the second preset threshold, setting the label T_ij of the j-th channel of the i-th frame signal to 1;
If the second judgment result indicates that the zero-crossing rate is not greater than the second preset threshold, setting the label T_ij of the j-th channel of the i-th frame signal to 0;
Calculating the total state value SS(i) of the labels of the four channels of the i-th frame signal using the formula SS(i) = T_i1 && T_i2 && T_i3 && T_i4, where T_i1, T_i2, T_i3, and T_i4 denote the labels of the 1st, 2nd, 3rd, and 4th channels of the i-th frame signal;
Judging whether the total state value SS(i) equals 1, obtaining a third judgment result;
If the third judgment result indicates that SS(i) equals 1, setting the i-th frame signal as a valid frame signal;
Judging whether the short-time frame energy of the j-th channel of the i-th frame signal is less than a third preset threshold, obtaining a fourth judgment result;
If the fourth judgment result indicates that the short-time frame energy of the j-th channel of the i-th frame signal is less than the third preset threshold, setting the i-th frame signal as the end point of the speech signal and obtaining the valid frame signal subset;
If the fourth judgment result indicates that the short-time frame energy of the j-th channel of the i-th frame signal is not less than the third preset threshold, increasing i by 1 and returning to the step of calculating the zero-crossing rate of the j-th channel of the i-th frame signal.
In step 104, obtaining, according to the valid frame signal subset, the averaged generalized spectral-subtraction modified phase transform function fused with quadratic correlation for any two channels of valid frame signals specifically includes: calculating, according to the valid frame signal subset, the quadratic correlation of any two channel frame signals of each valid frame signal; calculating, according to the valid frame signal subset, the power spectrum of each channel frame signal of each valid frame signal; and obtaining, according to the power spectrum of each channel frame signal, the noise masking function of each channel frame signal of each valid frame signal:
where z_pq(ω) denotes the noise masking function of the q-th channel frame signal of the p-th valid frame signal, X_pq(ω) denotes the power spectrum of the q-th channel frame signal of the p-th valid frame signal, q = 1, 2, 3, 4, N(ω) denotes the noise power spectrum, α denotes the first coefficient and β denotes the second coefficient; obtaining, according to the noise masking function of each channel frame signal of each valid frame signal and the quadratic correlation of any two channel frame signals, the generalized spectral-subtraction modified phase transform function fused with quadratic correlation for any two channel frame signals of each valid frame signal:
where φ_ls_p(ω) denotes the generalized spectral-subtraction modified phase transform function fused with quadratic correlation for the l-th channel frame signal and the s-th channel frame signal of the p-th valid frame signal, l = 1, 2, 3, 4, s = 1, 2, 3, 4, l ≠ s, X_pl(ω) and X_ps(ω) denote the power spectrum of the l-th channel frame signal and the power spectrum of the s-th channel frame signal of the p-th valid frame signal, respectively, and ρ denotes the third coefficient; and obtaining, according to the generalized spectral-subtraction modified phase transform function fused with quadratic correlation for any two channel frame signals of each valid frame signal, the averaged generalized spectral-subtraction modified phase transform function fused with quadratic correlation for any two channels of valid frame signals:
where [symbol] denotes the averaged generalized spectral-subtraction modified phase transform function fused with quadratic correlation for the l-th channel valid frame signal and the s-th channel valid frame signal, and P denotes the number of valid frame signals in the valid frame signal subset.
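As an illustration of how a frame-averaged, frequency-weighted cross-correlation yields a delay estimate, the following minimal sketch uses the standard PHAT weighting in place of the noise-masked APHAT weighting, whose formulas are not reproduced in this text. The function names, the zero-padding length and the sign convention (a positive delay means the first channel lags the second) are assumptions made for the example.

```python
import numpy as np

def averaged_gcc_phat(frames_a, frames_b, eps=1e-12):
    """Average the PHAT-weighted cross-correlation of two channels over P frames.

    frames_a, frames_b: arrays of shape (P, N), one row per valid frame.
    Returns (lags, r_mean), with lags running from -N to N-1 samples.
    """
    frames_a = np.asarray(frames_a, dtype=float)
    frames_b = np.asarray(frames_b, dtype=float)
    n = frames_a.shape[1]
    nfft = 2 * n                               # zero-pad to avoid circular wrap-around
    spec_a = np.fft.rfft(frames_a, nfft, axis=1)
    spec_b = np.fft.rfft(frames_b, nfft, axis=1)
    cross = spec_a * np.conj(spec_b)           # cross-power spectrum per frame
    cross /= np.abs(cross) + eps               # standard PHAT weighting (not APHAT)
    r = np.fft.irfft(cross, nfft, axis=1)      # per-frame correlation functions
    r_mean = np.fft.fftshift(r.mean(axis=0))   # average over frames, centre lag 0
    lags = np.arange(-n, n)
    return lags, r_mean

def estimate_delay(frames_a, frames_b, fs):
    """Delay in seconds at the maximum peak of the averaged function."""
    lags, r_mean = averaged_gcc_phat(frames_a, frames_b)
    return lags[np.argmax(r_mean)] / fs
```

Averaging the per-frame functions before peak picking is what the text describes; only the weighting differs in this sketch.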
In step 105, determining the direction of the sound source according to the geometric position of the four-element microphone array and the delay values between the sound source signals of any two microphones specifically includes:
According to the geometric position of the four-element microphone array and the delay values between the sound source signals of any two microphones, calculating the azimuth angle θ of the sound source relative to the coordinate origin using the formula [formula];
According to the geometric position of the four-element microphone array and the delay values between the sound source signals of any two microphones, calculating the pitch angle φ of the sound source relative to the coordinate origin using the formula [formula];
where c is the speed of sound, d is the distance from each microphone array element to the coordinate origin, τ12 denotes the delay value between the sound source signals of the 1st and 2nd microphones, τ13 denotes the delay value between the sound source signals of the 1st and 3rd microphones, and τ14 denotes the delay value between the sound source signals of the 1st and 4th microphones. Specifically, the azimuth angle θ and the pitch angle φ are solved from the geometric position of the four-element microphone array (the microphone coordinates are m1(d, 0, 0), m2(0, d, 0), m3(-d, 0, 0) and m4(0, -d, 0)), the spherical coordinate formula x² + y² + z² = r², the formula for the distance between two points [formula] and the velocity formula [formula].
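The delay-to-angle relation can be sketched under a far-field (plane-wave) assumption for the cross-shaped array m1(d, 0, 0), m2(0, d, 0), m3(-d, 0, 0), m4(0, -d, 0). This is not necessarily the formula of the invention: the sketch assumes that τ1k is the arrival time at microphone k minus the arrival time at microphone 1, that φ is measured from the z-axis, and that c = 343 m/s. Under those assumptions τ13 = 2d·sinφ·cosθ/c and τ14 - τ12 = 2d·sinφ·sinθ/c. The function names are hypothetical.

```python
import math

def far_field_delays(theta_deg, phi_deg, d, c=343.0):
    """Simulate (tau12, tau13, tau14) for a plane wave arriving from (theta, phi)."""
    theta, phi = math.radians(theta_deg), math.radians(phi_deg)
    ux = math.sin(phi) * math.cos(theta)   # x component of the unit direction vector
    uy = math.sin(phi) * math.sin(theta)   # y component of the unit direction vector
    mics = [(d, 0.0), (0.0, d), (-d, 0.0), (0.0, -d)]   # (x, y) of m1..m4, z = 0
    # Arrival time relative to the origin: a microphone nearer the source hears it earlier.
    t = [-(mx * ux + my * uy) / c for mx, my in mics]
    return t[1] - t[0], t[2] - t[0], t[3] - t[0]

def doa_from_delays(tau12, tau13, tau14, d, c=343.0):
    """Recover (theta, phi) in degrees from the three delays (far-field assumption)."""
    a = tau13              # equals 2*d*sin(phi)*cos(theta)/c
    b = tau14 - tau12      # equals 2*d*sin(phi)*sin(theta)/c
    theta = math.atan2(b, a)
    s = min(1.0, c * math.hypot(a, b) / (2.0 * d))   # clamp against noisy delays
    phi = math.asin(s)     # source assumed at or above the array plane
    return math.degrees(theta), math.degrees(phi)
```

Using atan2 keeps the quadrant of θ, which a plain arctangent of the delay ratio would lose.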
To illustrate the effect of the sound source localization method of the present invention, simulations were carried out under different signal-to-noise ratios and reverberation conditions. As can be seen from Figs. 3 and 4, in a moderately noisy environment (signal-to-noise ratio SNR = 5 dB), the delay estimation accuracy of the phase transform (PHAT) algorithm is far below that of the modified cross-power spectrum phase (MCPSP) algorithm and of the generalized spectral-subtraction modified phase transform cross-correlation function (GCC-APHAT, generalized spectral subtraction ameliorated phase transformation) method, and APHAT is clearly better than MCPSP; in a strongly noisy environment (SNR = -5 dB), the performance of PHAT drops sharply, and only MCPSP and APHAT still maintain good performance. As can be seen from Fig. 5, when strong reverberation and strong noise are both present (T60 = 750 ms, SNR = -5 dB), the APHAT algorithm achieves better delay accuracy than the PHAT and MCPSP algorithms. Taken together, these comparisons verify that the APHAT algorithm is robust to both noise and reverberation.
To further illustrate the effect of the sound source localization method of the present invention, an experiment was set up in a real environment. The sound source signals were recorded with the Q801 multi-channel data acquisition card of Beijing Sheng Kece Acoustic Technology Co., Ltd. (SKC), as shown in Fig. 6; the array bracket and the MP40 microphones of the four-element microphone array are also SKC products, as shown in Figs. 7 and 8.
The experiment was carried out in a 7.2 m × 6 m × 3.2 m room with all doors and windows closed. The room is known to contain a certain amount of background noise and reverberation, including the sound of the host computer fan, other human-made interference and reflections from tables and chairs. The sound source is a female student's utterance ("I go to Beijing"), recorded in the real environment. The sampling rate of the signal is 8 kHz, the frame length is 256, the frame shift is 128, and a Hamming window is applied. The coordinates of the four-element microphone array are m1(25 cm, 0, 0), m2(0, 25 cm, 0), m3(-25 cm, 0, 0) and m4(0, -25 cm, 0), and the microphone array is placed 70 cm above the ground. For the comparison of experimental data in the real environment, the sound source positions listed in Table 1 below were selected, and 10 groups of data were collected at each position. Taking the PHAT and MCPSP algorithms as references, the performance of the proposed APHAT algorithm is compared experimentally with that of the other two algorithms. The localization results of the improved APHAT algorithm and of PHAT and MCPSP on the real-environment data are shown in Table 1, the localization errors are shown in Table 2, and the root-mean-square localization errors are shown in Table 3.
Table 1
Serial number | S(x,y,z) | (r,θ,φ) | PHAT | MCPSP | APHAT |
1 | (1,1,0.76) | (1.6,45°,61.6°) | (45°,58.6°) | (45°,58.6°) | (45°,57.3°) |
2 | (2,1,0.76) | (2.36,26.6°,71.2°) | (26.6°,74.6°) | (26.6°,74.6°) | (26.6°,71.9°) |
3 | (2,2,0.76) | (2.93,45°,75°) | (45°,77.4°) | (45°,77.4°) | (45°,74.1°) |
4 | (-2,1,0.76) | (2.36,-26.6°,71.2°) | (-26.6°,74.6°) | (-26.6°,74.6°) | (-26.6°,71.9°) |
5 | (-2,2,0.76) | (2.93,-45°,75°) | (-45°,77.4°) | (-45°,77.4°) | (-45°,74.1°) |
6 | (1.2,0.6,0.76) | (1.54,26.6°,60.4°) | (37.9°,79.5°) | (29.1°,62.6°) | (29.1°,61.1°) |
7 | (-2.4,2.4,0.76) | (3.48,-45°,77.4°) | (-45°,77.4°) | (-45°,77.4°) | (-45°,74.1°) |
8 | (1.5,1.2,0.76) | (2.07,38.7°,68.5°) | (41.2°,66.5°) | (37.9°,79.5°) | (41.2°,64.6°) |
9 | (1.8,1.2,0.76) | (2.29,33.7°,70.6°) | (0,234°) | (37.9°,79.5°) | (37.9°,75.7°) |
10 | (2,1.2,0.76) | (2.45,31°,71.9°) | (-36.9°,59.6°) | (30.1°,90.2°) | (31°,82.4°) |
11 | (1.2,0,0.76) | (1.42,0°,57.7°) | (0,59.6°) | (0,59.6°) | (0,58.2°) |
12 | (0,1.8,0.76) | (1.95,90°,67.1°) | (0,234°) | (-90°,71.6°) | (90°,69.2°) |
13 | (1.2,2.4,0.76) | (2.79,63.4°,74.2°) | (63.4°,74.6°) | (63.4°,74.6°) | (63.4°,71.9°) |
14 | (0,1.2,0.76) | (1.42,90°,57.7°) | (-90°,59.6°) | (-90°,59.6°) | (90°,58.2°) |
15 | (-1.2,0,0.76) | (1.42,180°,57.7°) | (180,234°) | (180,59.6°) | (180°,58.7°) |
16 | (0,-1.2,0.76) | (1.42,-90°,57.7°) | (90°,59.6°) | (90°,59.6°) | (-90°,58.2°) |
17 | (-0.6,-1.2,0.76) | (1.54,63.4°,60.4°) | (60.9°,62.6°) | (60.9°,62.6°) | (60.9°,61.1°) |
18 | (0.6,-1.2,0.76) | (1.54,-63.4°,60.4°) | (-63.4°,74.6°) | (-68.2°,68.3°) | (-68.2°,66.3°) |
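The reference column (r, θ, φ) of Table 1 appears consistent with r = sqrt(x² + y² + z²) and with φ measured from the z-axis, φ = arccos(z/r). This reading is inferred from the tabulated values rather than stated in the text; the sketch below, with a hypothetical helper name, checks it against rows 1 and 2 to within rounding.

```python
import math

def range_and_pitch(x, y, z):
    """Return (r, phi_deg) with phi measured from the z-axis (assumed convention)."""
    r = math.sqrt(x * x + y * y + z * z)
    phi_deg = math.degrees(math.acos(z / r))
    return r, phi_deg

# Rows 1 and 2 of Table 1: S(x, y, z) and the tabulated (r, phi).
for (x, y, z), (r_tab, phi_tab) in [((1, 1, 0.76), (1.6, 61.6)),
                                    ((2, 1, 0.76), (2.36, 71.2))]:
    r, phi = range_and_pitch(x, y, z)
    print(f"r = {r:.2f} (table {r_tab}), phi = {phi:.1f} deg (table {phi_tab})")
```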
Table 2
Table 3
Metric | PHAT | MCPSP | APHAT |
Azimuth angle θ RMSE | 59.4 | 68.6 | 1.6 |
Pitch angle φ RMSE | 63.2 | 4.6 | 2.7 |
Comparing the experimental results in Tables 1 and 2 shows that, in the real environment, the azimuth and pitch angle estimates of the PHAT algorithm are unstable and have large errors. The MCPSP algorithm produces azimuth estimates in the opposite direction when the sound source lies on the X-axis or Y-axis of the system coordinates. The proposed APHAT algorithm, by contrast, has stable localization performance and higher localization accuracy: Table 3 shows that the root-mean-square error (RMSE) of its azimuth angle is 1.6 and the RMSE of its pitch angle is 2.7. The angular deviation of the APHAT algorithm is essentially within an acceptable range and its accuracy is relatively high, which confirms the effectiveness of the proposed algorithm.
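A root-mean-square angular error of the kind reported in Table 3 could be computed as in the sketch below. Wrapping each difference into [-180°, 180°) is an assumption of this example; the text does not say how differences across the ±180° boundary were treated, and the function name is hypothetical.

```python
import math

def angular_rmse(estimates_deg, truths_deg):
    """Root-mean-square error between estimated and true angles, in degrees."""
    total = 0.0
    for est, true in zip(estimates_deg, truths_deg):
        # Wrap the difference into [-180, 180) so that 179 vs -179 counts as 2 degrees.
        diff = (est - true + 180.0) % 360.0 - 180.0
        total += diff * diff
    return math.sqrt(total / len(truths_deg))
```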
Embodiment 3
Embodiment 3 of the present invention provides a sound source localization system.
As shown in Fig. 9, the present invention provides a sound source localization system, which includes:
a sound source speech signal acquisition module 901, configured to acquire four channels of sound source speech signals using a four-element microphone array, where the four-element microphone array includes four microphones and each microphone acquires one channel of sound source speech signal;
a framing module 902, configured to perform synchronous framing on the four channels of sound source speech signals to obtain a frame signal set, where each frame signal in the frame signal set includes four channel frame signals, namely a first channel frame signal, a second channel frame signal, a third channel frame signal and a fourth channel frame signal;
a valid frame signal subset acquisition module 903, configured to judge the validity of each frame signal in the frame signal set to obtain a valid frame signal subset;
an acquisition module 904 for the averaged generalized spectral-subtraction modified phase transform function fused with quadratic correlation, configured to obtain, according to the valid frame signal subset, the averaged generalized spectral-subtraction modified phase transform function fused with quadratic correlation for any two channels of valid frame signals;
a delay value calculation module 905, configured to obtain the time point corresponding to the maximum peak of the averaged generalized spectral-subtraction modified phase transform function fused with quadratic correlation for any two channels of valid frame signals, to obtain the delay value between the sound source signals of any two microphones;
a direction determination module 906, configured to determine the direction of the sound source according to the geometric position of the four-element microphone array and the delay values between the sound source signals of any two microphones.
Embodiment 4
Embodiment 4 of the present invention provides a preferred implementation of the sound source localization system.
The framing module 902 specifically includes: a framing submodule, configured to perform synchronous windowing and framing on the four channels of sound source speech signals using the window function [formula], to obtain the frame signals x_ij(n), where n denotes the n-th sampling point, n = 1, 2, ..., N, N denotes the frame length, x_ij(n) denotes the signal of the j-th channel of the i-th frame signal, and j = 1, 2, 3, 4; and a synthesis submodule, configured to combine all frame signals into the frame signal set.
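The synchronous windowing and framing step can be sketched as follows, using the Hamming window, frame length of 256 and frame shift of 128 reported in the experimental section. The array layout and the function name are assumptions of the example.

```python
import numpy as np

def frame_channels(x, frame_len=256, hop=128):
    """Split a (channels, samples) array into Hamming-windowed frames.

    Returns an array of shape (num_frames, channels, frame_len), so that
    frames[i, j] plays the role of x_ij(n) in the text (zero-based indices).
    The same frame boundaries are used for every channel, which keeps the
    four channels synchronised.
    """
    x = np.asarray(x, dtype=float)
    num_channels, num_samples = x.shape
    if num_samples < frame_len:
        return np.empty((0, num_channels, frame_len))
    num_frames = 1 + (num_samples - frame_len) // hop
    window = np.hamming(frame_len)
    frames = np.empty((num_frames, num_channels, frame_len))
    for i in range(num_frames):
        start = i * hop
        frames[i] = x[:, start:start + frame_len] * window
    return frames
```

With a frame shift of half the frame length, consecutive frames overlap by 50 %.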
The valid frame signal subset acquisition module 903 specifically includes: a short-time frame energy calculation submodule, configured to calculate the short-time frame energy of the j-th channel frame signal of the i-th frame signal using the formula [formula], where E_ij denotes the short-time frame energy of the j-th channel frame signal of the i-th frame signal, n denotes the n-th sampling point, n = 1, 2, ..., N, and N denotes the frame length; a first judgment submodule, configured to judge whether the short-time frame energy of the j-th channel frame signal of the i-th frame signal is greater than the first preset threshold, to obtain a first judgment result; a first judgment result processing submodule, configured to, if the first judgment result indicates that the short-time frame energy is not greater than the first preset threshold, increase the value of i by 1 and call the short-time frame energy calculation submodule to execute the step "calculate the short-time frame energy of the j-th channel frame signal of the i-th frame signal using the formula [formula]", and, if the first judgment result indicates that the short-time frame energy is greater than the first preset threshold, set the i-th frame signal as the starting point and increase the value of i by 1; a zero-crossing rate calculation submodule, configured to calculate the zero-crossing rate of the j-th channel frame signal of the i-th frame signal using the formula [formula], where [formula]; a second judgment submodule, configured to judge whether the zero-crossing rate is greater than the second preset threshold, to obtain a second judgment result; a second judgment result processing submodule, configured to, if the second judgment result indicates that the zero-crossing rate is greater than the second preset threshold, set the label T_ij of the j-th channel frame signal of the i-th frame signal to 1, and, if the second judgment result indicates that the zero-crossing rate is not greater than the second preset threshold, set the label T_ij of the j-th channel frame signal of the i-th frame signal to 0;
a total state value SS(i) calculation submodule, configured to calculate the total state value SS(i) of the labels of the four channel frame signals of the i-th frame signal using the formula SS(i) = T_i1 && T_i2 && T_i3 && T_i4, where T_i1, T_i2, T_i3 and T_i4 denote the labels of the 1st, 2nd, 3rd and 4th channel frame signals of the i-th frame signal, respectively; a third judgment submodule, configured to judge whether the total state value SS(i) is equal to 1, to obtain a third judgment result; a third judgment result processing submodule, configured to, if the third judgment result indicates that SS(i) is equal to 1, set the i-th frame signal as a valid frame signal; a fourth judgment submodule, configured to judge whether the short-time frame energy of the j-th channel frame signal of the i-th frame signal is less than the third preset threshold, to obtain a fourth judgment result; and a fourth judgment result processing submodule, configured to, if the fourth judgment result indicates that the short-time frame energy of the j-th channel frame signal of the i-th frame signal is less than the third preset threshold, set the i-th frame signal as the end point of the speech signal and obtain the valid frame signal subset, and, if the fourth judgment result indicates that the short-time frame energy of the j-th channel frame signal of the i-th frame signal is not less than the third preset threshold, increase the value of i by 1 and call the zero-crossing rate calculation submodule to execute the step "calculate the zero-crossing rate of the j-th channel frame signal of the i-th frame signal using the formula [formula]".
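The valid-frame screening described above can be sketched as follows. The energy and zero-crossing formulas are not reproduced in this text, so the sketch assumes the usual definitions (sum of squared samples; half the sum of absolute sign changes). Using a single reference channel for the energy tests, the threshold parameter names and the function names are likewise assumptions.

```python
import numpy as np

def short_time_energy(frame):
    """Assumed definition: sum of squared samples of one channel frame."""
    return float(np.sum(frame ** 2))

def zero_crossing_rate(frame):
    """Assumed definition: half the sum of |sgn x(n) - sgn x(n-1)|, with sgn(0) = 1."""
    signs = np.where(frame >= 0, 1, -1)
    return 0.5 * float(np.sum(np.abs(np.diff(signs))))

def select_valid_frames(frames, e_start, zcr_min, e_end, ref=0):
    """Return indices of valid frames; frames has shape (num_frames, 4, frame_len)."""
    num_frames, num_channels = frames.shape[0], frames.shape[1]
    i = 0
    # Advance until the energy exceeds the first threshold (starting point).
    while i < num_frames and short_time_energy(frames[i, ref]) <= e_start:
        i += 1
    if i == num_frames:
        return []
    i += 1  # as in the text, i is increased after the starting point is set
    valid = []
    while i < num_frames:
        labels = [zero_crossing_rate(frames[i, j]) > zcr_min
                  for j in range(num_channels)]      # T_i1 .. T_i4
        if all(labels):                              # SS(i) = T_i1 && T_i2 && T_i3 && T_i4
            valid.append(i)
        if short_time_energy(frames[i, ref]) < e_end:
            break                                    # end point of the speech signal
        i += 1
    return valid
```

Following the text literally, the starting frame itself is not tested for zero crossings; whether it belongs to the valid subset is not stated, and this sketch leaves it out.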
The acquisition module 904 for the averaged generalized spectral-subtraction modified phase transform function fused with quadratic correlation specifically includes: a quadratic correlation calculation submodule, configured to calculate, according to the valid frame signal subset, the quadratic correlation of any two channel frame signals of each valid frame signal; a power spectrum calculation submodule, configured to calculate, according to the valid frame signal subset, the power spectrum of each channel frame signal of each valid frame signal; and a noise masking function acquisition submodule, configured to obtain, according to the power spectrum of each channel frame signal, the noise masking function of each channel frame signal of each valid frame signal:
where z_pq(ω) denotes the noise masking function of the q-th channel frame signal of the p-th valid frame signal, X_pq(ω) denotes the power spectrum of the q-th channel frame signal of the p-th valid frame signal, q = 1, 2, 3, 4, N(ω) denotes the noise power spectrum, α denotes the first coefficient and β denotes the second coefficient; an acquisition submodule for the generalized spectral-subtraction modified phase transform function fused with quadratic correlation, configured to obtain, according to the noise masking function of each channel frame signal of each valid frame signal and the quadratic correlation of any two channel frame signals, the generalized spectral-subtraction modified phase transform function fused with quadratic correlation for any two channel frame signals of each valid frame signal:
where φ_ls_p(ω) denotes the generalized spectral-subtraction modified phase transform function fused with quadratic correlation for the l-th channel frame signal and the s-th channel frame signal of the p-th valid frame signal, l = 1, 2, 3, 4, s = 1, 2, 3, 4, l ≠ s, X_pl(ω) and X_ps(ω) denote the power spectrum of the l-th channel frame signal and the power spectrum of the s-th channel frame signal of the p-th valid frame signal, respectively, and ρ denotes the third coefficient; and an acquisition submodule for the averaged generalized spectral-subtraction modified phase transform function fused with quadratic correlation, configured to obtain, according to the generalized spectral-subtraction modified phase transform function fused with quadratic correlation for any two channel frame signals of each valid frame signal, the averaged generalized spectral-subtraction modified phase transform function fused with quadratic correlation for any two channels of valid frame signals:
where [symbol] denotes the averaged generalized spectral-subtraction modified phase transform function fused with quadratic correlation for the l-th channel valid frame signal and the s-th channel valid frame signal, and P denotes the number of valid frame signals in the valid frame signal subset.
According to the specific embodiments provided by the present invention, the present invention discloses the following technical effects:
The present invention discloses a sound source localization method and system. The sound source localization method of the present invention first windows and frames the sound source speech signals acquired by the four-element microphone array, then detects the valid frame signals, and calculates, for the valid frame signals that are screened out, the generalized spectral-subtraction modified phase transform function fused with quadratic correlation. To further improve the delay accuracy, the delay values are calculated from the averaged generalized spectral-subtraction modified phase transform function fused with quadratic correlation. Finally, the direction of the sound source is estimated from the geometric position of the microphone array and the calculated delay values, which improves the accuracy of sound source localization.
The embodiments in this specification are described in a progressive manner; each embodiment focuses on its differences from the other embodiments, and for the parts that are the same or similar the embodiments may be referred to one another. Since the system disclosed in the embodiments corresponds to the method disclosed in the embodiments, its description is relatively brief, and the relevant details can be found in the description of the method.
Specific examples are used herein to explain the principles and implementations of the present invention. The description of the above embodiments is intended only to help in understanding the method of the present invention and its core idea. The described embodiments are only some, not all, of the embodiments of the present invention; all other embodiments obtained by a person of ordinary skill in the art on the basis of the embodiments of the present invention without creative effort fall within the scope of protection of the present invention.
Claims (10)
1. A sound source localization method, characterized in that the sound source localization method comprises the following steps:
acquiring four channels of sound source speech signals using a four-element microphone array, wherein the four-element microphone array comprises four microphones and each microphone acquires one channel of sound source speech signal;
performing synchronous framing on the four channels of sound source speech signals to obtain a frame signal set, wherein each frame signal in the frame signal set comprises four channel frame signals, namely a first channel frame signal, a second channel frame signal, a third channel frame signal and a fourth channel frame signal;
judging the validity of each frame signal in the frame signal set to obtain a valid frame signal subset;
obtaining, according to the valid frame signal subset, the averaged generalized spectral-subtraction modified phase transform function fused with quadratic correlation for any two channels of valid frame signals;
obtaining the sample point corresponding to the maximum peak of the averaged generalized spectral-subtraction modified phase transform function fused with quadratic correlation for any two channels of valid frame signals, to obtain the delay value between the sound source signals of any two microphones;
determining the direction of the sound source according to the geometric position of the four-element microphone array and the delay values between the sound source signals of any two microphones.
2. The sound source localization method according to claim 1, characterized in that performing synchronous framing on the four channels of sound source speech signals to obtain a frame signal set specifically comprises:
performing synchronous windowing and framing on the four channels of sound source speech signals using the window function [formula], to obtain the frame signals x_ij(n), where n denotes the n-th sampling point, n = 1, 2, ..., N, N denotes the frame length, x_ij(n) denotes the signal of the j-th channel of the i-th frame signal, and j = 1, 2, 3, 4;
combining all frame signals into the frame signal set.
3. The sound source localization method according to claim 1, characterized in that judging the validity of each frame signal in the frame signal set to obtain a valid frame signal subset specifically comprises:
calculating the short-time frame energy of the j-th channel frame signal of the i-th frame signal using the formula [formula], where E_ij denotes the short-time frame energy of the j-th channel frame signal of the i-th frame signal, n denotes the n-th sampling point, n = 1, 2, ..., N, and N denotes the frame length;
judging whether the short-time frame energy of the j-th channel frame signal of the i-th frame signal is greater than the first preset threshold, to obtain a first judgment result;
if the first judgment result indicates that the short-time frame energy is not greater than the first preset threshold, increasing the value of i by 1 and returning to the step "calculate the short-time frame energy of the j-th channel frame signal of the i-th frame signal using the formula [formula]";
if the first judgment result indicates that the short-time frame energy is greater than the first preset threshold, setting the i-th frame signal as the starting point and increasing the value of i by 1;
calculating the zero-crossing rate of the j-th channel frame signal of the i-th frame signal using the formula [formula], where [formula];
judging whether the zero-crossing rate is greater than the second preset threshold, to obtain a second judgment result;
if the second judgment result indicates that the zero-crossing rate is greater than the second preset threshold, setting the label T_ij of the j-th channel frame signal of the i-th frame signal to 1;
if the second judgment result indicates that the zero-crossing rate is not greater than the second preset threshold, setting the label T_ij of the j-th channel frame signal of the i-th frame signal to 0;
calculating the total state value SS(i) of the labels of the four channel frame signals of the i-th frame signal using the formula SS(i) = T_i1 && T_i2 && T_i3 && T_i4, where T_i1, T_i2, T_i3 and T_i4 denote the labels of the 1st, 2nd, 3rd and 4th channel frame signals of the i-th frame signal, respectively;
judging whether the total state value SS(i) is equal to 1, to obtain a third judgment result;
if the third judgment result indicates that SS(i) is equal to 1, setting the i-th frame signal as a valid frame signal;
judging whether the short-time frame energy of the j-th channel frame signal of the i-th frame signal is less than the third preset threshold, to obtain a fourth judgment result;
if the fourth judgment result indicates that the short-time frame energy of the j-th channel frame signal of the i-th frame signal is less than the third preset threshold, setting the i-th frame signal as the end point of the speech signal and obtaining the valid frame signal subset;
if the fourth judgment result indicates that the short-time frame energy of the j-th channel frame signal of the i-th frame signal is not less than the third preset threshold, increasing the value of i by 1 and returning to the step "calculate the zero-crossing rate of the j-th channel frame signal of the i-th frame signal using the formula [formula]".
4. The sound source localization method according to claim 1, characterized in that obtaining, according to the valid frame signal subset, the averaged generalized spectral-subtraction modified phase transform function fused with quadratic correlation for any two channels of valid frame signals specifically comprises:
calculating, according to the valid frame signal subset and by combining autocorrelation and cross-correlation, the quadratic correlation of any two channel frame signals of each valid frame signal;
calculating, according to the valid frame signal subset, the power spectrum of each channel frame signal of each valid frame signal;
obtaining, according to the power spectrum of each channel frame signal, the noise masking function of each channel frame signal of each valid frame signal:
where z_pq(ω) denotes the noise masking function of the q-th channel frame signal of the p-th valid frame signal, X_pq(ω) denotes the power spectrum of the q-th channel frame signal of the p-th valid frame signal, q = 1, 2, 3, 4, N(ω) denotes the noise power spectrum, α denotes the first coefficient and β denotes the second coefficient;
obtaining, according to the noise masking function of each channel frame signal of each valid frame signal and the quadratic correlation of any two channel frame signals, the generalized spectral-subtraction modified phase transform function fused with quadratic correlation for any two channel frame signals of each valid frame signal:
where φ_ls_p(ω) denotes the generalized spectral-subtraction modified phase transform function fused with quadratic correlation for the l-th channel frame signal and the s-th channel frame signal of the p-th valid frame signal, l = 1, 2, 3, 4, s = 1, 2, 3, 4, l ≠ s, X_pl(ω) and X_ps(ω) denote the power spectrum of the l-th channel frame signal and the power spectrum of the s-th channel frame signal of the p-th valid frame signal, respectively, and ρ denotes the third coefficient;
obtaining, according to the generalized spectral-subtraction modified phase transform function fused with quadratic correlation for any two channel frame signals of each valid frame signal, the averaged generalized spectral-subtraction modified phase transform function fused with quadratic correlation for any two channels of valid frame signals:
where [symbol] denotes the averaged generalized spectral-subtraction modified phase transform function fused with quadratic correlation for the l-th channel valid frame signal and the s-th channel valid frame signal, and P denotes the number of valid frame signals in the valid frame signal subset.
5. The sound source localization method according to claim 1, characterized in that determining the direction of the sound source according to the geometric position of the four-element microphone array and the delay values between the sound source signals of any two microphones specifically comprises:
calculating, according to the geometric position of the four-element microphone array and the delay values between the sound source signals of any two microphones, the azimuth angle θ of the sound source relative to the coordinate origin using the formula [formula];
calculating, according to the geometric position of the four-element microphone array and the delay values between the sound source signals of any two microphones, the pitch angle φ of the sound source relative to the coordinate origin using the formula [formula];
where c is the speed of sound, d is the distance from each microphone array element to the coordinate origin, τ12 denotes the delay value between the sound source signals of the 1st and 2nd microphones, τ13 denotes the delay value between the sound source signals of the 1st and 3rd microphones, and τ14 denotes the delay value between the sound source signals of the 1st and 4th microphones.
6. The sound source localization method according to claim 1, characterized in that, before performing synchronous framing on the four channels of sound source speech signals to obtain a frame signal set, the method further comprises:
performing speech enhancement on each channel of sound source speech signal to obtain a speech-enhanced signal;
performing band-pass filtering on the speech-enhanced signal to obtain a band-pass filtered signal;
performing wavelet threshold denoising on the band-pass filtered signal to obtain a pre-processed sound source speech signal.
7. A sound source localization system, characterized in that the sound source localization system comprises:
a sound source speech signal acquisition module, configured to acquire four channels of sound source speech signals using a four-element microphone array, wherein the four-element microphone array comprises four microphones and each microphone acquires one channel of sound source speech signal;
a framing module, configured to perform synchronous framing on the four channels of sound source speech signals to obtain a frame signal set, wherein each frame signal in the frame signal set comprises four channel frame signals, namely a first channel frame signal, a second channel frame signal, a third channel frame signal and a fourth channel frame signal;
a valid frame signal subset acquisition module, configured to judge the validity of each frame signal in the frame signal set to obtain a valid frame signal subset;
an acquisition module for the averaged generalized spectral-subtraction modified phase transform function fused with quadratic correlation, configured to obtain, according to the valid frame signal subset, the averaged generalized spectral-subtraction modified phase transform function fused with quadratic correlation for any two channels of valid frame signals;
a delay value calculation module, configured to obtain the time point corresponding to the maximum peak of the averaged generalized spectral-subtraction modified phase transform function fused with quadratic correlation for any two channels of valid frame signals, to obtain the delay value between the sound source signals of any two microphones;
a direction determination module, configured to determine the direction of the sound source according to the geometric position of the four-element microphone array and the delay values between the sound source signals of any two microphones.
8. The sound source localization system according to claim 7, characterized in that the framing module specifically comprises:
a framing submodule, configured to perform synchronous windowing and framing on the four channels of sound source speech signals using the window function [formula], to obtain the frame signals x_ij(n), where n denotes the n-th sampling point, n = 1, 2, ..., N, N denotes the frame length, x_ij(n) denotes the signal of the j-th channel of the i-th frame signal, and j = 1, 2, 3, 4;
a synthesis submodule, configured to combine all frame signals into the frame signal set.
9. The sound source localization system according to claim 7, characterized in that the valid frame signal subset acquisition module specifically comprises:
a short-time frame energy calculation submodule, configured to calculate the short-time frame energy of the j-th channel frame signal of the i-th frame signal using the formula [formula], where E_ij denotes the short-time frame energy of the j-th channel frame signal of the i-th frame signal, n denotes the n-th sampling point, n = 1, 2, ..., N, and N denotes the frame length;
a first judgment submodule, configured to judge whether the short-time frame energy of the j-th channel frame signal of the i-th frame signal is greater than the first preset threshold, to obtain a first judgment result;
a first judgment result processing submodule, configured to, if the first judgment result indicates that the short-time frame energy is not greater than the first preset threshold, increase the value of i by 1 and call the short-time frame energy calculation submodule to execute the step "calculate the short-time frame energy of the j-th channel frame signal of the i-th frame signal using the formula [formula]", and, if the first judgment result indicates that the short-time frame energy is greater than the first preset threshold, set the i-th frame signal as the starting point and increase the value of i by 1;
a zero-crossing rate calculation submodule, configured to calculate the zero-crossing rate of the j-th channel frame signal of the i-th frame signal using the formula [formula], where [formula];
a second judgment submodule, configured to judge whether the zero-crossing rate is greater than the second preset threshold, to obtain a second judgment result;
a second judgment result processing submodule, configured to, if the second judgment result indicates that the zero-crossing rate is greater than the second preset threshold, set the label T_ij of the j-th channel frame signal of the i-th frame signal to 1, and, if the second judgment result indicates that the zero-crossing rate is not greater than the second preset threshold, set the label T_ij of the j-th channel frame signal of the i-th frame signal to 0;
Total state value SS (i) computational submodule, for utilizing formula S S (i)=Ti1&&Ti2&&Ti3&&Ti4, calculate i-th of frame letter
Number four tunnel frame signals label total state value SS (i);Wherein, Ti1、Ti2、Ti3And Ti4Respectively indicate the of i-th of frame signal
The label on 1 tunnel, the 2nd tunnel, the 3rd road and the 4th tunnel frame signal;
Third judging submodule obtains third judging result for judging whether total state value SS (i) is equal to 1;
Third result treatment submodule sets i-th of signal frame if indicating that SS (i) is equal to 1 for the third judging result
It is set to useful signal frame;
4th judging submodule, for judging it is pre- whether the short time frame energy of jth road frame signal of i-th of frame signal is less than third
If threshold value, the 4th judging result is obtained;
4th judging result handles submodule, if indicating the jth road frame signal of i-th of frame signal for the 4th judging result
Short time frame energy be less than the third predetermined threshold value, then set i-th of signal frame to the terminating point of voice signal, had
Imitate frame signal subset;If the 4th judging result indicates that the short time frame energy of the jth road frame signal of i-th of frame signal is not less than
The value of i is then increased by 1 by the third predetermined threshold value, calls zero-crossing rate computational submodule, is executed step and " is utilized formulaCalculate the zero-crossing rate of the jth road frame signal of i-th of frame signal ".
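The valid-frame selection of claim 9 can be sketched as a small loop. The energy and zero-crossing formulas are not reproduced in this text, so the sketch assumes the common textbook definitions (sum of squared samples, and half the sum of absolute sign differences). It also assumes a single reference channel for the energy tests, which the claim does not specify. All function and parameter names are hypothetical.

```python
from typing import List, Sequence


def short_time_energy(frame: Sequence[float]) -> float:
    """Assumed definition: sum of squared samples of one channel frame."""
    return sum(v * v for v in frame)


def zero_crossing_rate(frame: Sequence[float]) -> float:
    """Assumed definition: 0.5 * sum |sgn(x(n)) - sgn(x(n-1))|."""
    def sgn(v: float) -> int:
        return 1 if v >= 0 else -1
    return 0.5 * sum(abs(sgn(frame[n]) - sgn(frame[n - 1]))
                     for n in range(1, len(frame)))


def select_valid_frames(frames: Sequence[Sequence[Sequence[float]]],
                        e_start: float, zcr_thr: float, e_end: float,
                        ref_channel: int = 0) -> List[int]:
    """Return indices of frames whose four channels all pass the ZCR test.

    frames[i][j] is channel j of frame i. Scanning starts after the first
    frame whose energy exceeds e_start and stops at the first later frame
    whose energy drops below e_end.
    """
    valid: List[int] = []
    started = False
    for i, frame in enumerate(frames):
        energy = short_time_energy(frame[ref_channel])
        if not started:
            # First threshold: the frame that exceeds it marks the start.
            started = energy > e_start
            continue
        # Labels T_ij, combined as SS(i) = T_i1 && T_i2 && T_i3 && T_i4.
        if all(zero_crossing_rate(ch) > zcr_thr for ch in frame):
            valid.append(i)
        # Third threshold: energy below it marks the end point.
        if energy < e_end:
            break
    return valid
```

Following the wording of the claim, the starting frame itself is not tested for zero-crossing rate; testing begins with the next frame.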
10. The sound source localization system according to claim 7, wherein the module for obtaining the secondary-correlation-fused average generalized spectral-subtraction modified phase transform function specifically comprises:
A secondary correlation calculation submodule, configured to calculate, from the valid frame signal subset and by combining autocorrelation with cross-correlation, the secondary correlation of any two channel frame signals of each valid frame signal;
A power spectrum calculation submodule, configured to calculate, from the valid frame signal subset, the power spectrum of each channel frame signal of each valid frame signal;
A noise masking function obtaining submodule, configured to obtain, from the power spectrum of each channel frame signal, the noise masking function of each channel frame signal of each valid frame signal:
[formula not reproduced]
where z_pq(ω) denotes the noise masking function of the q-th channel frame signal of the p-th valid frame signal, X_pq(ω) denotes the power spectrum of the q-th channel frame signal of the p-th valid frame signal, q = 1, 2, 3, 4, N(ω) denotes the noise power spectrum, α denotes a first coefficient, and β denotes a second coefficient;
A submodule for obtaining the secondary-correlation-fused generalized spectral-subtraction modified phase transform function, configured to obtain, from the noise masking function of each channel frame signal of each valid frame signal and the secondary correlation of any two channel frame signals, the secondary-correlation-fused generalized spectral-subtraction modified phase transform function of any two channel frame signals of each valid frame signal:
[formula not reproduced]
where φ_ls_p(ω) denotes the secondary-correlation-fused generalized spectral-subtraction modified phase transform function of the l-th channel frame signal and the s-th channel frame signal of the p-th valid frame signal, l = 1, 2, 3, 4, s = 1, 2, 3, 4, l ≠ s, X_pl(ω) and X_ps(ω) denote the power spectrum of the l-th channel frame signal and the power spectrum of the s-th channel frame signal of the p-th valid frame signal, respectively, and ρ denotes a third coefficient;
A submodule for obtaining the secondary-correlation-fused average generalized spectral-subtraction modified phase transform function, configured to obtain, from the secondary-correlation-fused generalized spectral-subtraction modified phase transform functions of any two channel frame signals of each valid frame signal, the secondary-correlation-fused average generalized spectral-subtraction modified phase transform function of any two channels of valid frame signals:
[formula not reproduced]
where [symbol not reproduced] denotes the secondary-correlation-fused average generalized spectral-subtraction modified phase transform function of the l-th channel valid frame signal and the s-th channel valid frame signal, and P denotes the number of valid frame signals in the valid frame signal subset.
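The per-frame weighting and the averaging over P valid frames in claim 10 can be illustrated with a simplified stand-in. Because the patent's formulas are not reproduced in this text, the sketch below substitutes the standard phase transform (PHAT) weighting with an exponent `rho` and omits both the noise masking function and the secondary correlation; it is therefore not the claimed function. A naive DFT is used only to keep the example dependency-free, and all names are hypothetical.

```python
import cmath
from typing import List, Sequence


def dft(x: Sequence[complex], inverse: bool = False) -> List[complex]:
    """Naive O(N^2) discrete Fourier transform (illustration only)."""
    n_pts = len(x)
    sign = 1.0 if inverse else -1.0
    out = []
    for k in range(n_pts):
        acc = 0j
        for n in range(n_pts):
            acc += x[n] * cmath.exp(sign * 2j * cmath.pi * k * n / n_pts)
        out.append(acc / n_pts if inverse else acc)
    return out


def averaged_weighted_gcc(frames_l: Sequence[Sequence[float]],
                          frames_s: Sequence[Sequence[float]],
                          rho: float = 1.0, eps: float = 1e-12) -> List[float]:
    """Average a PHAT-style weighted cross-correlation over P frames.

    frames_l[p] and frames_s[p] are channels l and s of valid frame p.
    Output index k is a circular lag: k samples if k <= N/2, else k - N.
    A positive lag means channel s is delayed relative to channel l.
    """
    if not frames_l or len(frames_l) != len(frames_s):
        raise ValueError("need the same non-zero number of frames per channel")
    n_pts = len(frames_l[0])
    avg = [0.0] * n_pts
    for x_l, x_s in zip(frames_l, frames_s):
        if len(x_l) != n_pts or len(x_s) != n_pts:
            raise ValueError("all frames must have the same length")
        spec_l, spec_s = dft(x_l), dft(x_s)
        cross = [spec_l[k].conjugate() * spec_s[k] for k in range(n_pts)]
        # PHAT-style weighting: divide by |cross|^rho (rho = 1 keeps phase only).
        weighted = [c / (abs(c) ** rho + eps) for c in cross]
        lag_domain = dft(weighted, inverse=True)
        for k in range(n_pts):
            avg[k] += lag_domain[k].real / len(frames_l)
    return avg
```

Averaging across frames before peak picking tends to suppress spurious peaks that appear in only a few frames, which is the motivation the abstract gives for the averaged function.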
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910312565.6A CN110007276B (en) | 2019-04-18 | 2019-04-18 | Sound source positioning method and system |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910312565.6A CN110007276B (en) | 2019-04-18 | 2019-04-18 | Sound source positioning method and system |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110007276A (en) | 2019-07-12 |
CN110007276B (en) | 2021-01-12 |
Family
ID=67172766
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910312565.6A, Active, CN110007276B (en) | 2019-04-18 | 2019-04-18 | Sound source positioning method and system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110007276B (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110703198A (en) * | 2019-10-22 | 2020-01-17 | 哈尔滨工程大学 | Quaternary cross array envelope spectrum estimation method based on frequency selection |
CN110706717A (en) * | 2019-09-06 | 2020-01-17 | 西安合谱声学科技有限公司 | Microphone array panel-based human voice detection orientation method |
Citations (19)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101901602A (en) * | 2010-07-09 | 2010-12-01 | 中国科学院声学研究所 | Method for reducing noise by using hearing threshold of impaired hearing |
CN102110441A (en) * | 2010-12-22 | 2011-06-29 | 中国科学院声学研究所 | Method for generating sound masking signal based on time reversal |
CN102707262A (en) * | 2012-06-20 | 2012-10-03 | 太仓博天网络科技有限公司 | Sound localization system based on microphone array |
CN103235287A (en) * | 2013-04-17 | 2013-08-07 | 华北电力大学(保定) | Sound source localization camera shooting tracking device |
KR20130114437A (en) * | 2012-04-09 | 2013-10-17 | 주식회사 센서웨이 | The time delay estimation method based on cross-correlation and apparatus thereof |
EP2680263A1 (en) * | 2012-06-27 | 2014-01-01 | Orange | Estimation of low complexity coupling |
CN103607361A (en) * | 2013-06-05 | 2014-02-26 | 西安电子科技大学 | Time frequency overlap signal parameter estimation method under Alpha stable distribution noise |
EP2543037B1 (en) * | 2010-03-29 | 2014-03-05 | Fraunhofer Gesellschaft zur Förderung der angewandten Wissenschaft E.V. | A spatial audio processor and a method for providing spatial parameters based on an acoustic input signal |
CN104076331A (en) * | 2014-06-18 | 2014-10-01 | 南京信息工程大学 | Sound source positioning method for seven-element microphone array |
US9081083B1 (en) * | 2011-06-27 | 2015-07-14 | Amazon Technologies, Inc. | Estimation of time delay of arrival |
CN104991573A (en) * | 2015-06-25 | 2015-10-21 | 北京品创汇通科技有限公司 | Locating and tracking method and apparatus based on sound source array |
CN106098077A (en) * | 2016-07-28 | 2016-11-09 | 浙江诺尔康神经电子科技股份有限公司 | Artificial cochlea's speech processing system of a kind of band noise reduction and method |
CN106226739A (en) * | 2016-07-29 | 2016-12-14 | 太原理工大学 | Merge the double sound source localization method of Substrip analysis |
CN107102296A (en) * | 2017-04-27 | 2017-08-29 | 大连理工大学 | A kind of sonic location system based on distributed microphone array |
CN107644650A (en) * | 2017-09-29 | 2018-01-30 | 山东大学 | A kind of improvement sound localization method based on progressive serial orthogonalization blind source separation algorithm and its realize system |
US20180074163A1 (en) * | 2016-09-08 | 2018-03-15 | Nanjing Avatarmind Robot Technology Co., Ltd. | Method and system for positioning sound source by robot |
CN108198568A (en) * | 2017-12-26 | 2018-06-22 | 太原理工大学 | A kind of method and system of more auditory localizations |
CN108333575A (en) * | 2018-02-02 | 2018-07-27 | 浙江大学 | Moving sound time delay filtering method based on Gaussian prior and Operations of Interva Constraint |
US20180359563A1 (en) * | 2017-06-12 | 2018-12-13 | Ryo Tanaka | Method for accurately calculating the direction of arrival of sound at a microphone array |
- 2019-04-18: CN application CN201910312565.6A, patent CN110007276B (en), status Active
Non-Patent Citations (5)
Title |
---|
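LIXIA HUANG et al.: "Classification of Improved Cross-correlation Function to Determine Speaker Location from Microphone Array", 《2019 IEEE FIFTH INTERNATIONAL CONFERENCE ON BIG DATA COMPUTING SERVICE AND APPLICATIONS (BIGDATASERVICE)》 * |
YUNMEI GONG et al.: "Time delays of arrival estimation for sound source location based on coherence method in correlated noise environments", 《2010 SECOND INTERNATIONAL CONFERENCE ON COMMUNICATION SYSTEMS, NETWORKS AND APPLICATIONS》 * |
ZHANG CHUANYI et al.: "Sound source localization technology based on the generalized cross-power spectrum phase method" (基于广义互功率谱相位法的声源定位技术), 《Journal of Northeastern University (Natural Science)》 * |
CHENG FANGXIAO et al.: "Sound source localization algorithm based on improved time delay estimation" (基于改进时延估计的声源定位算法), 《Journal of Jilin University (Science Edition)》 * |
HUANG LIXIA et al.: "Dual sound source localization fusing a smoothing filter and subband analysis" (融合平滑滤波器和子带分析的双声源定位), 《Computer Simulation》 * |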
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110706717A (en) * | 2019-09-06 | 2020-01-17 | 西安合谱声学科技有限公司 | Microphone array panel-based human voice detection orientation method |
CN110706717B (en) * | 2019-09-06 | 2021-11-09 | 西安合谱声学科技有限公司 | Microphone array panel-based human voice detection orientation method |
CN110703198A (en) * | 2019-10-22 | 2020-01-17 | 哈尔滨工程大学 | Quaternary cross array envelope spectrum estimation method based on frequency selection |
CN110703198B (en) * | 2019-10-22 | 2022-03-22 | 哈尔滨工程大学 | Quaternary cross array envelope spectrum estimation method based on frequency selection |
Also Published As
Publication number | Publication date |
---|---|
CN110007276B (en) | 2021-01-12 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
CN104076331B (en) | A kind of sound localization method of seven yuan of microphone arrays | |
CN107102296B (en) | Sound source positioning system based on distributed microphone array | |
CN106872944B (en) | Sound source positioning method and device based on microphone array | |
CN104459625B (en) | The sound source locating device and method of two-microphone array are moved based on track | |
CN105068048B (en) | Distributed microphone array sound localization method based on spatial sparsity | |
CN111239687B (en) | Sound source positioning method and system based on deep neural network | |
CN109839612A (en) | Sounnd source direction estimation method based on time-frequency masking and deep neural network | |
CN103308889B (en) | Passive sound source two-dimensional DOA (direction of arrival) estimation method under complex environment | |
CN110444208A (en) | A kind of speech recognition attack defense method and device based on gradient estimation and CTC algorithm | |
CN109036467B (en) | TF-LSTM-based CFFD extraction method, voice emotion recognition method and system | |
CN104991573A (en) | Locating and tracking method and apparatus based on sound source array | |
CN108877827A (en) | Voice-enhanced interaction method and system, storage medium and electronic equipment | |
CN103901401A (en) | Binaural sound source positioning method based on binaural matching filter | |
CN105204001A (en) | Sound source positioning method and system | |
CN110007276A (en) | A kind of sound localization method and system | |
CN108877809A (en) | A kind of speaker's audio recognition method and device | |
CN107167770A (en) | A kind of microphone array sound source locating device under the conditions of reverberation | |
CN110534126A (en) | A kind of auditory localization and sound enhancement method and system based on fixed beam formation | |
CN103901400B (en) | A kind of based on delay compensation and ears conforming binaural sound source of sound localization method | |
CN108896962A (en) | Iteration localization method based on sound position fingerprint | |
CN106371057B (en) | Voice sound source direction-finding method and device | |
CN107144818A (en) | Binaural sound sources localization method based on two-way ears matched filter Weighted Fusion | |
CN104535964A (en) | Helmet type microphone array sound source positioning method based on low-frequency diffraction delay inequalities | |
CN106886010A (en) | A kind of sound bearing recognition methods based on mini microphone array | |
CN109461447A (en) | A kind of end-to-end speaker's dividing method and system based on deep learning |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||