CN108269581B - Double-microphone time delay difference estimation method based on frequency domain coherent function - Google Patents
Double-microphone time delay difference estimation method based on frequency domain coherent function
- Publication number
- CN108269581B (application CN201710004194.6A)
- Authority
- CN
- China
- Prior art keywords
- coherent
- function
- frequency domain
- calculating
- signals
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0208—Noise filtering
- G10L21/0216—Noise filtering characterised by the method used for estimating noise
- G10L21/0232—Processing in the frequency domain
- G10L21/0264—Noise filtering characterised by the type of parameter measurement, e.g. correlation techniques, zero crossing techniques or predictive techniques
- G10L2021/02161—Number of inputs available containing the signal or the noise to be suppressed
- G10L2021/02166—Microphone arrays; Beamforming
Landscapes
- Engineering & Computer Science (AREA)
- Computational Linguistics (AREA)
- Quality & Reliability (AREA)
- Signal Processing (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Circuit For Audible Band Transducer (AREA)
- Measurement Of Velocity Or Position Using Acoustic Or Ultrasonic Waves (AREA)
Abstract
The invention discloses a double-microphone time-delay-difference estimation method based on the frequency-domain coherence function, comprising the following steps: step 1) before any sound-source signal is received by the two microphones, calculating the coherence functions of the double-microphone signals for different azimuth angles, extracting the real and imaginary parts of the coherence function corresponding to each angle, and establishing a database of coherence-function feature quantities; step 2) transforming the signals received by the two microphones into the frequency domain, calculating the coherence function of the two signals in the frequency domain, performing peak smoothing at each frequency bin according to the modulus of the coherence function, then extracting the real and imaginary parts of the coherence function, matching them in the coherence-function feature database to obtain the sound-source azimuth angle, and calculating the time-delay difference. By selecting, frame by frame, the frequency bins at which the modulus of the coherence function is large, the method effectively suppresses the interference of background noise and reverberation and improves the accuracy of the double-microphone time-delay-difference estimation.
Description
Technical Field
The invention relates to the field of speech-signal time-delay estimation, and in particular to a double-microphone time-delay-difference estimation method based on the frequency-domain coherence function.
Background
Time-delay-difference estimation plays an important role in algorithms such as sound-source localization and speech enhancement for small microphone arrays. The time-delay difference is the difference in arrival time of the same source signal at the different microphones, caused by their different propagation distances.
Known methods estimate the delay difference by locating the peak of the generalized cross-correlation function (reference [1]: Knapp C, Carter G. The generalized correlation method for estimation of time delay. IEEE Transactions on Acoustics, Speech, and Signal Processing, 1976, 24(4): 320-327; reference [2]: Liu C, Wheeler B C, O'Brien W D Jr, et al. Localization of multiple sound sources with two microphones. Journal of the Acoustical Society of America, 2000, 108(4): 1888-1905; reference [3]: [bibliographic details garbled; Journal of the Acoustical Society of America, 2000, 108: 1888-1905]; reference [4]: [bibliographic details garbled; IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 1983: 1148-1151]; reference [5]: Lindemann W. Extension of a binaural cross-correlation model by contralateral inhibition. I. Simulation of lateralization for stationary signals. Journal of the Acoustical Society of America, 1986, 80(6): 1608-1622). In quiet environments such methods give a good estimate of the delay difference, but under background noise and reverberation the accuracy of the delay-difference estimate degrades sharply.
Disclosure of Invention
The invention aims to overcome the shortcomings of prior-art time-delay-difference estimation methods. It exploits the fact that the modulus of the coherence function of direct speech equals 1, whereas the modulus of the coherence function of noise and reverberant segments is generally less than 1. The peak of the coherence function is smoothed in the frequency domain using the modulus of the coherence function, so that after peak smoothing each frequency bin of the coherence function is dominated by the direct sound; in other words, the peak-smoothed coherence function filters out part of the interference from background noise and reverberation. The peak-smoothed coherence function is then matched against the ideal coherence functions to find the azimuth angle whose ideal coherence function is closest, and the time-delay difference is finally computed from that azimuth, thereby remedying the deficiencies of existing time-delay-difference estimation techniques.
In order to achieve the above object, the present invention provides a double-microphone time-delay-difference estimation method based on the frequency-domain coherence function, the method comprising:
step 1) before any sound-source signal is received by the two microphones, calculating the coherence functions of the double-microphone signals for different azimuth angles, extracting the real part and the imaginary part of the coherence function corresponding to each angle, and establishing a database of coherence-function feature quantities;
and step 2) transforming the signals received by the two microphones into the frequency domain, calculating the coherence function of the two signals in the frequency domain, performing peak smoothing at each frequency bin according to the modulus of the coherence function, then extracting the real part and the imaginary part of the coherence function, matching them in the coherence-function feature database to obtain the sound-source azimuth angle, and calculating the time-delay difference.
In the above technical solution, the step 1) specifically includes:
step 1-1) varying the azimuth angle of the sound source from 0° to 180° in increments of 7.5°, and calculating the 25 coherence functions of the double-microphone signals under ideal conditions;
the coherence function is:
Γ(ω, θ) = cos(ω·τ·cos θ) + j·sin(ω·τ·cos θ) = e^(j·ω·τ·cos θ), with τ = d·fs/c,
wherein d is the distance between the two microphones, c = 340 m/s is the speed of sound, θ is the azimuth angle of the sound source, ω is the angular frequency and fs is the sampling rate;
step 1-2) extracting the real part and the imaginary part of each of the 25 coherence functions;
step 1-3) using the K-nearest-neighbour (KNN) classification algorithm to obtain the feature quantity of the coherence function at each angle: the imaginary and real parts are classified, each angle corresponding to one class, 25 classes in total, with the classification labels set to 1-25, thereby establishing the coherence-function feature database.
In the above technical solution, the step 2) specifically includes:
step 2-1) framing and windowing the signals received by the two microphones and then transforming each frame into the frequency domain by FFT; the frequency-domain signals are denoted X1(λ, μ) and X2(λ, μ), wherein λ is the time-frame index and μ is the frequency-bin index;
step 2-2) calculating a coherent function of the two signals on a frequency domain;
the coherence function is calculated as:
ΓX1X2(λ, μ) = PX1X2(λ, μ) / √( PX1X1(λ, μ) · PX2X2(λ, μ) ),
where PX1X1(λ, μ) is the auto-power spectrum of the signal X1(λ, μ), PX2X2(λ, μ) is the auto-power spectrum of the signal X2(λ, μ), and PX1X2(λ, μ) is the cross-power spectrum of the two signals:
PX1X1(λ, μ) = α·PX1X1(λ−1, μ) + (1 − α)·|X1(λ, μ)|²
PX2X2(λ, μ) = α·PX2X2(λ−1, μ) + (1 − α)·|X2(λ, μ)|²
PX1X2(λ, μ) = α·PX1X2(λ−1, μ) + (1 − α)·X1(λ, μ)·X2*(λ, μ)
where α is a smoothing factor and X2*(λ, μ) denotes the complex conjugate of X2(λ, μ);
step 2-3) calculating the modulus |ΓX1X2(λ, μ)| of the coherence function and performing peak smoothing at each frequency bin to obtain the smoothed peak value Peak′(λ, μ);
step 2-4) extracting, in the frequency domain, the real part and the imaginary part of the peak-smoothed coherence function; using these two values, matching against the coherence-function feature database with the K-nearest-neighbour (KNN) classification algorithm, so that each frame of the signal yields a matching result; the matching result is the index of a classification label, from which the corresponding azimuth angle θ0 is obtained; the time-delay difference Time is then:
Time = d·cos(θ0)/c
in the above technical solution, the value of the smoothing factor α in the step 2-2) is 0.68.
In the above technical solution, the specific implementation of step 2-3) is as follows:
computing the modulus |ΓX1X2(λ, μ)| of the coherence function and obtaining the peak value Peak(λ, μ) of each frequency bin; if |ΓX1X2(λ, μ)| is greater than |Peak(λ, μ)|, Peak(λ, μ) is smoothed with the smoothing factor α1 = 0.35 to give the smoothed peak value Peak′(λ, μ);
if |ΓX1X2(λ, μ)| is smaller than |Peak(λ, μ)|, the smoothing factor α2 = 0.95 is used instead; the initial value of Peak(λ, μ) is ΓX1X2(1, μ), i.e. the coherence function of the first speech frame.
The invention has the following advantage: by selecting, frame by frame, the frequency bins at which the modulus of the coherence function is large, the method effectively suppresses the interference of background noise and reverberation and improves the accuracy of the double-microphone time-delay-difference estimation.
Drawings
FIG. 1 is a schematic diagram of the double-microphone time-delay-difference estimation scenario of the present invention;
FIG. 2 is a schematic diagram of the present invention for building an ideal frequency domain coherence function database;
FIG. 3 is a schematic representation of the real and imaginary parts of an ideal coherence function at three different azimuths;
fig. 4 is a flow chart of step 2) of the method of the invention.
Detailed Description
The invention is further described with reference to the following figures and specific embodiments.
The time-delay estimation method based on the frequency-domain coherence function estimates the time-delay difference from a coherence function obtained in the frequency domain in real time. The modulus of the coherence function (which equals 1 for the target direct sound and is smaller for background noise and for speech segments strongly affected by reverberation) is used to judge the reliability of each frequency bin; concretely, a peak-smoothing function is used to obtain a reliable coherence function, which filters out the influence of background noise and reverberant sound to a certain extent. The result is then matched against the pre-built database of ideal coherence-function features at different angles; the best match gives the current angle, from which the time-delay difference is obtained. The method is suitable for double-microphone sound-source localization, source separation and similar devices.
FIG. 1 is a schematic diagram of the double-microphone time-delay-difference estimation scenario: the two microphones are separated by a distance d, the sound source lies at an azimuth angle θ relative to the microphone pair, and the time-delay difference between the two microphones is d·cos(θ)/c.
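For example, with d = 0.255 m (the spacing used in FIG. 3) and θ = 60°, the time-delay difference is 0.255 · cos 60° / 340 ≈ 0.375 ms, i.e. about 6 samples at a 16 kHz sampling rate.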
A two-microphone time delay difference estimation method based on a frequency domain coherence function comprises the following steps:
step 1) before any sound-source signal is received by the two microphones, calculating the coherence functions of the double-microphone signals for different azimuth angles, extracting the real part and the imaginary part of the coherence function corresponding to each angle, and establishing a database of coherence-function feature quantities;
as shown in fig. 2, the step 1) specifically includes:
step 1-1) varying the azimuth angle of the sound source from 0° to 180° in increments of 7.5°, and calculating the 25 ideal coherence functions of the double-microphone signals;
the coherence function is:
Γ(ω, θ) = cos(ω·τ·cos θ) + j·sin(ω·τ·cos θ) = e^(j·ω·τ·cos θ), with τ = d·fs/c,
wherein d is the distance between the two microphones, c = 340 m/s is the speed of sound, θ is the azimuth angle of the sound source, ω is the angular frequency, and fs is the sampling rate, which is 16000 Hz in this embodiment.
The azimuth angle of the sound source has 25 values, and 25 different coherence functions can be obtained in total.
Step 1-2) extracting the real part and the imaginary part of each of the 25 coherence functions;
the real part is cos(ω·τ·cos θ);
the imaginary part is sin(ω·τ·cos θ);
from the real-part and imaginary-part expressions it follows that, in an ideal environment (only the target sound source, with no interference from background noise or reverberation), the modulus of the coherence function is √(cos²(ω·τ·cos θ) + sin²(ω·τ·cos θ)) = 1.
step 1-3) utilizing a K nearest KNN (K-nearest neighbor) classification algorithm to obtain the characteristic quantity of the coherence function of each angle: and classifying the imaginary part and the real part, wherein each angle corresponds to one class, namely 25 classes in total, and the classification label is set to be 1-25, so that a coherence function characteristic quantity database is established.
FIG. 3 shows the real and imaginary parts of the ideal coherence function at three different azimuth angles; the microphone distance in this figure is 0.255 m. As can be seen from FIG. 3, the real and imaginary parts of the coherence function differ markedly between azimuth angles, and the invention exploits this property to make the classification decision.
And step 2) transforming the signals received by the two microphones into the frequency domain, calculating the coherence function of the two signals in the frequency domain, performing peak smoothing at each frequency bin according to the modulus of the coherence function, then extracting the real part and the imaginary part of the coherence function, matching them in the coherence-function feature database to obtain the sound-source azimuth angle, and calculating the time-delay difference.
As shown in fig. 4, the step 2) specifically includes:
step 2-1) framing and windowing the signals received by the two microphones and then transforming each frame into the frequency domain by FFT; the frequency-domain signals are denoted X1(λ, μ) and X2(λ, μ), wherein λ is the time-frame index and μ is the frequency-bin index;
In this embodiment the received signals x1 and x2 of the two microphones are framed, windowed and transformed by FFT; at a sampling rate of 16000 Hz each frame contains 512 samples and the frame shift is 128 samples.
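A minimal framing/windowing/FFT sketch matching these parameters (a Hann window is assumed here; the patent does not specify the window type) might look like this:

```python
import numpy as np

def stft_frames(x, frame_len=512, hop=128):
    """Frame, window (Hann assumed) and FFT a single-channel signal.

    Returns X[lambda, mu]: one row per time frame, one column per frequency bin.
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop
    X = np.empty((n_frames, frame_len // 2 + 1), dtype=complex)
    for lam in range(n_frames):
        frame = x[lam * hop : lam * hop + frame_len] * window
        X[lam] = np.fft.rfft(frame)
    return X

# X1 = stft_frames(x1); X2 = stft_frames(x2)   # x1, x2: the two microphone signals
```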
Step 2-2) calculating a coherent function of the two signals on a frequency domain;
the coherence function is calculated as:
ΓX1X2(λ, μ) = PX1X2(λ, μ) / √( PX1X1(λ, μ) · PX2X2(λ, μ) ),
where PX1X1(λ, μ) is the auto-power spectrum of the signal X1(λ, μ), PX2X2(λ, μ) is the auto-power spectrum of the signal X2(λ, μ), and PX1X2(λ, μ) is the cross-power spectrum of the two signals:
PX1X1(λ, μ) = α·PX1X1(λ−1, μ) + (1 − α)·|X1(λ, μ)|²
PX2X2(λ, μ) = α·PX2X2(λ−1, μ) + (1 − α)·|X2(λ, μ)|²
PX1X2(λ, μ) = α·PX1X2(λ−1, μ) + (1 − α)·X1(λ, μ)·X2*(λ, μ)
where α is a smoothing factor, taking the value 0.68, and X2*(λ, μ) denotes the complex conjugate of X2(λ, μ).
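The recursive spectral estimates and the resulting coherence can be sketched as follows (illustrative only; the cross-spectrum update uses the conjugate product X1·X2*, so that the coherence keeps its phase, and a small eps guards against division by zero):

```python
import numpy as np

def coherence_per_frame(X1, X2, alpha=0.68):
    """Recursively smoothed auto-/cross-power spectra and complex coherence.

    X1, X2: (n_frames, n_bins) STFTs of the two microphone signals.
    Returns Gamma[lambda, mu], the coherence function of each frame.
    """
    eps = 1e-12
    P11 = np.abs(X1[0]) ** 2
    P22 = np.abs(X2[0]) ** 2
    P12 = X1[0] * np.conj(X2[0])
    Gamma = np.empty_like(X1)
    Gamma[0] = P12 / np.sqrt(P11 * P22 + eps)
    for lam in range(1, X1.shape[0]):
        P11 = alpha * P11 + (1 - alpha) * np.abs(X1[lam]) ** 2
        P22 = alpha * P22 + (1 - alpha) * np.abs(X2[lam]) ** 2
        P12 = alpha * P12 + (1 - alpha) * X1[lam] * np.conj(X2[lam])
        Gamma[lam] = P12 / np.sqrt(P11 * P22 + eps)
    return Gamma
```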
Step 2-3) calculating the modulus |ΓX1X2(λ, μ)| of the coherence function and performing peak smoothing at each frequency bin: the peak value Peak(λ, μ) of each frequency bin is obtained; if |ΓX1X2(λ, μ)| is larger than |Peak(λ, μ)|, Peak(λ, μ) is smoothed with the smoothing factor α1 = 0.35 to give the smoothed peak value Peak′(λ, μ); if |ΓX1X2(λ, μ)| is smaller than |Peak(λ, μ)|, the smoothing factor α2 = 0.95 is used instead; the initial value of Peak(λ, μ) is ΓX1X2(1, μ), i.e. the coherence function of the first speech frame.
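The exact smoothing update equations are not reproduced in the text above, so the following is only one plausible attack/release smoother consistent with the stated constants: the peak rises quickly (α1 = 0.35) when |Γ| exceeds it, decays slowly (α2 = 0.95) otherwise, and is initialised with the first frame's coherence. This is an assumption, not the patent's verbatim formula:

```python
import numpy as np

def smooth_peaks(Gamma, alpha1=0.35, alpha2=0.95):
    """Attack/release smoothing of the coherence magnitude per frequency bin.

    Gamma: (n_frames, n_bins) complex coherence.  The update rule below is an
    editorial assumption: the peak rises quickly (alpha1) when |Gamma| exceeds
    it and decays slowly (alpha2) otherwise.
    """
    peak = np.abs(Gamma[0]).copy()            # initialised with the first speech frame
    peaks = np.empty(Gamma.shape, dtype=float)
    peaks[0] = peak
    for lam in range(1, Gamma.shape[0]):
        mag = np.abs(Gamma[lam])
        rise = mag > peak
        peak = np.where(rise,
                        alpha1 * peak + (1 - alpha1) * mag,   # fast attack
                        alpha2 * peak + (1 - alpha2) * mag)   # slow release
        peaks[lam] = peak
    return peaks
```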
Step 2-4) extracting, in the frequency domain, the real part and the imaginary part of the peak-smoothed coherence function; using these two values, matching against the coherence-function feature database with the K-nearest-neighbour (KNN) classification algorithm, each frame of the signal obtaining a matching result; the matching result is the index of a classification label, from which the corresponding azimuth angle θ0 is obtained; the time-delay difference is then:
Time = d·cos(θ0)/c
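The on-line matching stage could then be sketched as below (assuming the classifier and azimuth grid from the off-line sketch above; treating the real and imaginary parts of the peak-smoothed coherence as the per-frame feature vector mirrors the off-line stage and is an editorial assumption):

```python
import numpy as np

def estimate_delay(Gamma, knn, thetas, d=0.255, c=340.0):
    """Match each frame's coherence features against the ideal database
    and convert the winning azimuth to a time-delay difference.

    Gamma: (n_frames, n_bins) peak-smoothed coherence (assumption: its real
    and imaginary parts form the feature vector, as in the off-line stage).
    Returns the delay difference in seconds for each frame.
    """
    feats = np.concatenate([Gamma.real, Gamma.imag], axis=1)
    labels = knn.predict(feats)                  # class labels 1..25
    theta0 = thetas[labels - 1]                  # matched azimuth per frame (radians)
    return d * np.cos(theta0) / c                # delay difference in seconds; multiply by fs for samples
```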
the invention fully utilizes the characteristics of a real part and an imaginary part of a frequency domain coherent function to construct a characteristic vector to classify sound sources with different azimuth angles, and the algorithm realizes the azimuth judgment of each frame of voice signals through off-line modeling and on-line prediction, namely delay difference estimation. Meanwhile, the algorithm considers the interference problem under the background noise and reverberation environment, utilizes the module value of the frequency domain coherent function to smooth the peak value of the coherent function, and effectively ensures the effectiveness of the characteristic vector at the online prediction stage. The algorithm is clear in idea, simple and effective. The method is convenient to realize in real time in devices such as microphone arrays and the like.
Finally, it should be noted that the above embodiments are intended only to illustrate the technical solution of the invention and are not limiting. Although the invention has been described in detail with reference to the embodiments, it will be understood by those skilled in the art that various changes may be made and equivalents may be substituted without departing from the spirit and scope of the invention as defined in the appended claims.
Claims (1)
1. A two-microphone time delay difference estimation method based on a frequency domain coherence function, the method comprising:
step 1) before any sound-source signal is received by the two microphones, calculating the coherence functions of the two-microphone signals for different azimuth angles, extracting the real part and the imaginary part of the coherence function corresponding to each angle, and establishing a database of coherence-function feature quantities;
step 2) transforming the signals received by the two microphones into the frequency domain, calculating the coherence function of the two signals in the frequency domain, performing peak smoothing at each frequency bin according to the modulus of the coherence function, then extracting the real part and the imaginary part of the coherence function, matching them in the coherence-function feature database to obtain the sound-source azimuth angle, and calculating the time-delay difference;
the step 1) specifically comprises the following steps:
step 1-1) varying the azimuth angle of the sound source from 0° to 180° in increments of 7.5°, and calculating the 25 coherence functions of the two-microphone signals under ideal conditions;
the coherence function is:
Γ(ω, θ) = cos(ω·τ·cos θ) + j·sin(ω·τ·cos θ) = e^(j·ω·τ·cos θ), with τ = d·fs/c,
wherein d is the distance between the two microphones, c = 340 m/s is the speed of sound, θ is the azimuth angle of the sound source, ω is the angular frequency and fs is the sampling rate;
step 1-2) extracting the real part and the imaginary part of each of the 25 coherence functions;
step 1-3) using the K-nearest-neighbour (KNN) classification algorithm to obtain the feature quantity of the coherence function at each angle: the imaginary and real parts are classified, each angle corresponding to one class, 25 classes in total, with the classification labels set to 1-25, thereby establishing the coherence-function feature database;
the step 2) specifically comprises the following steps:
step 2-1) framing and windowing the signals received by the two microphones and transforming each frame into the frequency domain by FFT; the frequency-domain signals are denoted X1(λ, μ) and X2(λ, μ), wherein λ is the time-frame index and μ is the frequency-bin index;
step 2-2) calculating a coherent function of the two signals on a frequency domain;
the coherence function is calculated as:
ΓX1X2(λ, μ) = PX1X2(λ, μ) / √( PX1X1(λ, μ) · PX2X2(λ, μ) ),
wherein PX1X1(λ, μ) is the auto-power spectrum of the signal X1(λ, μ), PX2X2(λ, μ) is the auto-power spectrum of the signal X2(λ, μ), and PX1X2(λ, μ) is the cross-power spectrum of the two signals:
PX1X1(λ, μ) = α·PX1X1(λ−1, μ) + (1 − α)·|X1(λ, μ)|²
PX2X2(λ, μ) = α·PX2X2(λ−1, μ) + (1 − α)·|X2(λ, μ)|²
PX1X2(λ, μ) = α·PX1X2(λ−1, μ) + (1 − α)·X1(λ, μ)·X2*(λ, μ)
wherein α is a smoothing factor and X2*(λ, μ) denotes the complex conjugate of X2(λ, μ);
step 2-3) calculating the modulus |ΓX1X2(λ, μ)| of the coherence function and performing peak smoothing at each frequency bin to obtain the smoothed peak value Peak′(λ, μ);
step 2-4) extracting, in the frequency domain, the real part and the imaginary part of the peak-smoothed coherence function; using these two values, matching against the coherence-function feature database with the K-nearest-neighbour (KNN) classification algorithm, each frame of the signal obtaining a matching result; the matching result is the index of a classification label, from which the corresponding azimuth angle θ0 is obtained; the time-delay difference Time is then:
Time = d·cos(θ0)/c;
the value of the smoothing factor alpha in the step 2-2) is 0.68;
the specific implementation of step 2-3) is as follows:
computing the modulus |ΓX1X2(λ, μ)| of the coherence function and obtaining the peak value Peak(λ, μ) of each frequency bin; if |ΓX1X2(λ, μ)| is greater than |Peak(λ, μ)|, Peak(λ, μ) is smoothed with the smoothing factor α1 = 0.35 to give the smoothed peak value Peak′(λ, μ);
if |ΓX1X2(λ, μ)| is smaller than |Peak(λ, μ)|, the smoothing factor α2 = 0.95 is used instead; wherein the initial value of Peak(λ, μ) is equal to ΓX1X2(1, μ), i.e. the coherence function of the first speech frame.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710004194.6A CN108269581B (en) | 2017-01-04 | 2017-01-04 | Double-microphone time delay difference estimation method based on frequency domain coherent function |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710004194.6A CN108269581B (en) | 2017-01-04 | 2017-01-04 | Double-microphone time delay difference estimation method based on frequency domain coherent function |
Publications (2)
Publication Number | Publication Date |
---|---|
CN108269581A CN108269581A (en) | 2018-07-10 |
CN108269581B true CN108269581B (en) | 2021-06-08 |
Family
ID=62771665
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710004194.6A Active CN108269581B (en) | 2017-01-04 | 2017-01-04 | Double-microphone time delay difference estimation method based on frequency domain coherent function |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108269581B (en) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112992176B (en) * | 2020-09-30 | 2022-07-19 | 北京海兰信数据科技股份有限公司 | Ship acoustic signal identification method and device |
CN112526452B (en) * | 2020-11-24 | 2024-08-06 | 杭州萤石软件有限公司 | Sound source detection method, pan-tilt camera, intelligent robot and storage medium |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9354310B2 (en) * | 2011-03-03 | 2016-05-31 | Qualcomm Incorporated | Systems, methods, apparatus, and computer-readable media for source localization using audible sound and ultrasound |
- 2017-01-04: application CN201710004194.6A filed in China (CN); granted as patent CN108269581B, status Active
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101494522A (en) * | 2008-12-30 | 2009-07-29 | 清华大学 | Method for eliminating wireless signal interference based on network encode |
CN102854494A (en) * | 2012-08-08 | 2013-01-02 | Tcl集团股份有限公司 | Sound source locating method and device |
CN103076593A (en) * | 2012-12-28 | 2013-05-01 | 中国科学院声学研究所 | Sound source localization method and device |
CN104101871A (en) * | 2013-04-15 | 2014-10-15 | 中国科学院声学研究所 | Narrowband interference suppression method and narrowband interference suppression system used for passive synthetic aperture |
JP2016156944A (en) * | 2015-02-24 | 2016-09-01 | 日本電信電話株式会社 | Model estimation device, target sound enhancement device, model estimation method, and model estimation program |
Non-Patent Citations (4)
Title |
---|
A binaural sound source localization model based on time-delay compensation and interaural coherence; Hong Liu, Jie Zhang; ICASSP; 2014-12-31; pp. 1424-1428 *
A binaural speech enhancement algorithm: Application to background and directional noise fields; Yi Fang, Youyuan Chen, Haihong Feng; CISP 2015; 2015-12-31; pp. 1261-1265 *
A binaural speech enhancement algorithm for suppressing directional noise (一种抑制方向性噪声的双耳语音增强算法); 方义, 冯海泓, 陈友元, 胡晓城; Acta Acustica (声学学报); 2016-11-30; vol. 41, no. 6, pp. 897-904 *
A frequency-domain adaptive maximum-likelihood time delay estimation algorithm (一种频域自适应最大似然时延估计算法); 陈华伟, 赵俊渭, 郭业才; Systems Engineering and Electronics (系统工程与电子技术); 2003-11-30; vol. 25, no. 11, pp. 1355-1361 *
Also Published As
Publication number | Publication date |
---|---|
CN108269581A (en) | 2018-07-10 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN106251877B (en) | Voice Sounnd source direction estimation method and device | |
CN108731886B (en) | A kind of more leakage point acoustic fix ranging methods of water supply line based on iteration recursion | |
WO2019080551A1 (en) | Target voice detection method and apparatus | |
CN111429939B (en) | Sound signal separation method of double sound sources and pickup | |
CN110534126B (en) | Sound source positioning and voice enhancement method and system based on fixed beam forming | |
CN102411138A (en) | Method for positioning sound source by robot | |
Pavlidi et al. | Real-time multiple sound source localization using a circular microphone array based on single-source confidence measures | |
CN111239687A (en) | Sound source positioning method and system based on deep neural network | |
CN111044973A (en) | MVDR target sound source directional pickup method for microphone matrix | |
CN109188362A (en) | A kind of microphone array auditory localization signal processing method | |
Ren et al. | A novel multiple sparse source localization using triangular pyramid microphone array | |
CN103901400B (en) | A kind of based on delay compensation and ears conforming binaural sound source of sound localization method | |
CN109212481A (en) | A method of auditory localization is carried out using microphone array | |
CN108269581B (en) | Double-microphone time delay difference estimation method based on frequency domain coherent function | |
Imran et al. | A methodology for sound source localization and tracking: Development of 3D microphone array for near-field and far-field applications | |
Wang et al. | Pseudo-determined blind source separation for ad-hoc microphone networks | |
Hu et al. | Decoupled direction-of-arrival estimations using relative harmonic coefficients | |
WO2022042864A1 (en) | Method and apparatus for measuring directions of arrival of multiple sound sources | |
Hu et al. | Evaluation and comparison of three source direction-of-arrival estimators using relative harmonic coefficients | |
Grondin et al. | A study of the complexity and accuracy of direction of arrival estimation methods based on GCC-PHAT for a pair of close microphones | |
Nikunen et al. | Time-difference of arrival model for spherical microphone arrays and application to direction of arrival estimation | |
Firoozabadi et al. | Combination of nested microphone array and subband processing for multiple simultaneous speaker localization | |
Dang et al. | Multiple sound source localization based on a multi-dimensional assignment model | |
Deleforge et al. | Audio-motor integration for robot audition | |
Sledevič et al. | An evaluation of hardware-software design for sound source localization based on SoC |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
GR01 | Patent grant | |