CN111157949A - Voice recognition and sound source positioning method - Google Patents

Publication number
CN111157949A
Authority
CN
China
Prior art keywords
sound source
array
signal
microphone
algorithm
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201811326998.9A
Other languages
Chinese (zh)
Inventor
张梦巧
王洁莹
张喜明
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Changfeng Science Technology Industry Group Corp
Original Assignee
China Changfeng Science Technology Industry Group Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Changfeng Science Technology Industry Group Corp

Classifications

    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01S RADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S5/00 Position-fixing by co-ordinating two or more direction or position line determinations; Position-fixing by co-ordinating two or more distance determinations
    • G01S5/18 Position-fixing by co-ordinating two or more direction or position line determinations; Position-fixing by co-ordinating two or more distance determinations using ultrasonic, sonic, or infrasonic waves
    • G01S5/20 Position of source determined by a plurality of spaced direction-finders

Abstract

The invention provides a voice recognition and sound source positioning method comprising time delay estimation and sound source positioning. First, the relative time differences with which the sound source signal reaches the microphone elements in an array are estimated by an algorithm. Second, the estimated time differences are used to calculate the differences in distance from the sound source to the array elements, and the position of the sound source is determined by a geometric algorithm or by search, in combination with the array topology.

Description

Voice recognition and sound source positioning method
Technical Field
The invention relates to the field of computer signal processing, in particular to a voice recognition and sound source positioning method.
Background
Since the 1980s, microphone array signal processing techniques have developed rapidly and have found widespread use in radar, sonar, and communications. The idea of array signal processing was later applied to speech signal processing. Internationally, the use of microphone array systems for speech signal processing has been studied since 1970. In 1976, Gabfid applied the adaptive beamforming techniques of radar and sonar directly to the simple sound acquisition problem. In 1985, Flanagan of AT&T Bell Laboratories in the United States used 21 microphones to form an array, achieving electronically controlled acquisition of sound source signals for the first time. In the same year, Flanagan et al. applied a two-dimensional microphone array to sound pickup in large rooms to suppress the effects of reverberation and noise on the sound source signal. Owing to the technology of the time, the algorithm could not be implemented digitally and relied mainly on analog devices. In 1991, Kellermann implemented the algorithm fully digitally by means of digital signal processing, which further improved its performance, reduced hardware cost, and increased the flexibility of the system. Microphone array systems have since been used in many applications, including video conferencing, speech recognition, speaker recognition, speech acquisition in automotive environments, sound pickup in reverberant environments, sound source localization, and hearing aid devices. Speech processing based on microphone arrays is currently becoming a new research hotspot, but the related application technology is not yet mature.
Disclosure of Invention
The invention aims to provide a voice recognition and sound source positioning method which is expected to be applied to the fields of voice recognition, voice acquisition in a strong noise environment, conference recording in a large place, sound detection, hearing aid devices and the like.
In order to achieve the purpose, the invention adopts the following technical scheme:
a speech recognition and sound source localization method comprises time delay estimation and sound source localization, and is characterized in that: firstly, estimating the relative time difference of sound source signals arriving at microphone elements in an array through an algorithm; and secondly, calculating the distance difference of the sound source to each array element by using the estimated time difference, and determining the position of the sound source by combining the array topological structure through a geometric algorithm or search.
The specific method for estimating the time delay is as follows: assume that there is only a single sound source, that the microphones are arranged in a uniform linear array, and that a sound source signal $s(k)$ to be located exists in a far-field environment. With the first microphone element selected as the reference point, the signal received by the nth array element at time $k$ is expressed as:

$$
\begin{aligned}
y_n(k) &= \alpha_n s(k - t - \tau_{n1}) + v_n(k) \\
       &= \alpha_n s[k - t - F_n(\tau)] + v_n(k) \\
       &= x_n(k) + v_n(k), \quad n = 1, 2, \ldots, N
\end{aligned}
$$

where $\alpha_n$ ($n = 1, 2, \ldots, N$) is the attenuation of the signal during propagation, with a value in $[0, 1]$; $t$ is the propagation time of the signal from the source $s(k)$ to array element 1; $v_n(k)$ is the additive noise received at the nth array element; $\tau$ is the time delay difference between the signals received by microphone element 1 and microphone element 2; and the function $F_n(\tau)$ is the signal delay between the nth and the first array elements.
The specific method for positioning the sound source is as follows: the direction angle and the distance of the sound source are determined from the geometric relationship between the sound source and the array.
The present invention can be applied in practice in the following fields: video conferencing, where sound source positioning can track and locate the speakers in a video conference; robotics, where a robot locates and tracks a sound source using a binaural time delay model and cross-correlation; noise detection, where sound source positioning is an important method for evaluating engine performance and testing the stability of large machinery, so that noise in the engines of automobiles and motorcycles and in large instruments can be better controlled; and medical equipment, where sound source positioning can be used to analyze a lesion site and greatly assists the diagnosis of diseases.
Drawings
Fig. 1 is a schematic diagram of sound source localization of the present invention.
Detailed Description
The sound source localization method of the present invention is divided into two steps: delay estimation and sound source localization. First, the relative time differences with which the sound source signal reaches the microphone elements in the array are estimated by an algorithm. Second, the estimated time differences are used to calculate the differences in distance from the sound source to the array elements, and the position of the sound source is determined by a geometric algorithm or by search, in combination with the array topology.
1. Delay estimation
The geometry of the array is crucial to sound source positioning performance. Depending on the environment in which the microphone array is located, models for time delay estimation can be divided into an ideal model and a reverberation model. In the ideal model, each microphone element receives only the sound signal that reaches the microphone array via the direct path. The reverberation model considers not only the signal arriving via the direct path but also the signals that reach the array indirectly after the sound emitted by the source is reflected by walls, tables, and similar surfaces. Because the number of reverberant paths is uncertain, algorithms based on the reverberation model are more complex than those based on the ideal model. However, the reverberation model fits the effect of the interference with a mathematical model rather than leaving indirect-path signals out of consideration as the ideal model does, so delay estimation based on the reverberation model performs relatively well. Nevertheless, to reduce the complexity of the algorithm, the present invention mainly studies delay estimation for the microphone array under the ideal model.
Assuming only a single sound source, the microphone array is a uniform linear array. In a far-field environment, there is a sound source signal s (k) to be located, and if we select the first microphone element as the reference point, the signal received by the nth array element at time k can be expressed as:
$$
\begin{aligned}
y_n(k) &= \alpha_n s(k - t - \tau_{n1}) + v_n(k) \\
       &= \alpha_n s[k - t - F_n(\tau)] + v_n(k) \\
       &= x_n(k) + v_n(k), \quad n = 1, 2, \ldots, N
\end{aligned}
$$

where $\alpha_n$ ($n = 1, 2, \ldots, N$) is the attenuation of the signal during propagation, with a value in $[0, 1]$. $t$ is the propagation time of the signal from the source $s(k)$ to array element 1. $v_n(k)$ is the additive noise received at the nth array element; it is assumed that this noise is uncorrelated with the speech signal and with the noise signals of the other elements. $\tau$ is the time delay difference between the signals received by microphone element 1 and microphone element 2. The function $F_n(\tau)$ is the signal delay between the nth and the first array elements. It is assumed here that the microphone array model is a uniform linear array located in a far-field environment, so that:

$$
F_1(\tau) = 0, \quad F_2(\tau) = \tau, \quad F_n(\tau) = (n-1)\tau, \quad n = 2, \ldots, N
$$
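As an illustration of the far-field signal model above, the following is a minimal Python sketch that synthesizes the channel signals of a uniform linear array. It is not part of the patent: the function name `simulate_ula`, its parameters, the restriction to whole-sample delays, and the white Gaussian noise are all assumptions made for the example.

```python
import numpy as np

def simulate_ula(s, n_mics, tau, t=0, alphas=None, noise_std=0.0, seed=0):
    """Synthesize y_n(k) = alpha_n * s(k - t - (n-1)*tau) + v_n(k) for n = 1..n_mics.

    Hypothetical helper: t and tau are whole numbers of samples, and samples
    that would come from outside `s` are taken to be zero.
    """
    s = np.asarray(s, dtype=float)
    rng = np.random.default_rng(seed)
    if alphas is None:
        alphas = np.ones(n_mics)               # no attenuation by default
    K = len(s)
    y = np.zeros((n_mics, K))
    for n in range(1, n_mics + 1):
        delay = int(t + (n - 1) * tau)         # t + F_n(tau) in the far field
        k = np.arange(K) - delay               # source index feeding each output sample
        valid = (k >= 0) & (k < K)
        y[n - 1, valid] = alphas[n - 1] * s[k[valid]]   # x_n(k)
    y += noise_std * rng.standard_normal(y.shape)       # additive noise v_n(k)
    return y
```

With `noise_std=0.0` each row is simply a delayed, scaled copy of `s`, which makes the linear growth of the delay with the element index easy to inspect.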
In the near field, the signal arrives at the microphone array as a spherical wave, so $F_n$ is a nonlinear function of $\tau$. In this case $F_n$ depends both on the spacing of the microphone elements and on the position of the sound source relative to the array. For a uniform linear array, the function $F_n$ is known, so the time delay estimation problem is equivalent to the problem of estimating $\tau$: the time delay estimation algorithm computes an estimate of $\tau$ from the finite frames of multi-channel sound signal that have been collected [Figure BDA0001858994480000031].
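The text does not name the estimator used for $\tau$. As one possible illustration, the sketch below uses plain time-domain cross-correlation, which is an assumption swapped in for the example and not the patent's stated algorithm. The function names `estimate_delay` and `estimate_tau` and the `max_lag` parameter are hypothetical, and delays are resolved only to whole samples.

```python
import numpy as np

def estimate_delay(ref, sig, max_lag):
    """Return the lag (in samples) by which `sig` trails `ref`.

    Picks the lag in [-max_lag, max_lag] that maximizes the cross-correlation
    sum_k ref[k] * sig[k + lag]. Both inputs must have the same length.
    """
    ref = np.asarray(ref, dtype=float)
    sig = np.asarray(sig, dtype=float)
    K = len(ref)
    best_lag, best_score = 0, -np.inf
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            score = np.dot(ref[:K - lag], sig[lag:])
        else:
            score = np.dot(ref[-lag:], sig[:K + lag])
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag

def estimate_tau(y, max_lag):
    """Estimate tau for a far-field uniform linear array, where F_n(tau) = (n-1)*tau.

    Each channel n >= 2 yields a delay d_n relative to channel 1; tau is taken
    as the least-squares slope of d_n against (n-1).
    """
    y = np.asarray(y, dtype=float)
    n_idx = np.arange(1, y.shape[0])           # (n-1) for n = 2..N
    d = np.array([estimate_delay(y[0], y[n], max_lag) for n in n_idx])
    return float(np.dot(n_idx, d) / np.dot(n_idx, n_idx))
```

Fitting a single slope across all channels uses the known form of $F_n$ for the uniform linear array, so every element contributes to the one unknown $\tau$. In reverberant or noisy conditions a weighted variant of cross-correlation may be preferable; that choice is outside what the text specifies.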
2. Sound source localization
After the time delays of the microphone array have been estimated, the direction angle and the distance of the sound source can be determined from the geometric relationship between the sound source and the array. Positioning accuracy is influenced by many factors, chief among them the time delay estimation method and the positioning method. The present technique employs an improved sound source localization algorithm that treats the sound source as a point source and assumes that it is at infinity, so that the wavefront arriving at the array is planar and perpendicular to the direction of propagation. The timing of the signals received by microphones A and B is shown in Fig. 1, where $L$ is the distance between the two microphone elements, $c$ is the speed of sound in air, $\tau_{AB}$ is the difference between the arrival times of the sound source signal at the two microphones, i.e., the time delay between the array elements, and $\theta$ is the direction angle of the sound source.
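The relation between $\tau_{AB}$ and $\theta$ is not written out in the text, and Fig. 1 is not reproduced here. The sketch below uses the standard far-field plane-wave geometry for a two-element pair, $c\,\tau_{AB} = L\cos\theta$, under the assumption that $\theta$ is measured from the line through A and B. The function name `direction_angle`, the default sound speed of 343 m/s, and the angle convention are assumptions for the example; if Fig. 1 measures $\theta$ from the array normal, the sine replaces the cosine.

```python
import math

def direction_angle(tau_ab, mic_spacing, sound_speed=343.0):
    """Direction angle theta in degrees, measured from the line through A and B.

    Far-field relation (assumed convention): the path difference c * tau_AB
    equals L * cos(theta), so theta = arccos(c * tau_AB / L).
    tau_ab is in seconds, mic_spacing in metres, sound_speed in metres per second.
    """
    ratio = sound_speed * tau_ab / mic_spacing
    if abs(ratio) > 1.0 + 1e-9:
        raise ValueError("|c * tau_AB| exceeds L: inconsistent with a far-field plane wave")
    ratio = max(-1.0, min(1.0, ratio))         # absorb floating-point rounding
    return math.degrees(math.acos(ratio))
```

Under the plane-wave assumption a single microphone pair yields only a direction, not a range. Recovering distance as well would require additional geometry, such as the near-field model or further array elements.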

Claims (3)

1. A speech recognition and sound source localization method comprises time delay estimation and sound source localization, and is characterized in that: firstly, estimating the relative time difference of sound source signals arriving at microphone elements in an array through an algorithm; and secondly, calculating the distance difference of the sound source to each array element by using the estimated time difference, and determining the position of the sound source by combining the array topological structure through a geometric algorithm or search.
2. The method of claim 1, wherein the time delay estimation comprises: assuming that only a unique sound source exists, the microphones are arranged in a uniform linear array, a sound source signal s (k) to be positioned exists in a far-field environment, the first microphone element is selected as a reference point, and the signal received by the nth array element at the moment k is represented as:
$$
\begin{aligned}
y_n(k) &= \alpha_n s(k - t - \tau_{n1}) + v_n(k) \\
       &= \alpha_n s[k - t - F_n(\tau)] + v_n(k) \\
       &= x_n(k) + v_n(k), \quad n = 1, 2, \ldots, N
\end{aligned}
$$

where $\alpha_n$ ($n = 1, 2, \ldots, N$) is the attenuation of the signal during propagation, with a value in $[0, 1]$; $t$ is the propagation time of the signal from the source $s(k)$ to array element 1; $v_n(k)$ is the additive noise received at the nth array element; $\tau$ is the time delay difference between the signals received by microphone element 1 and microphone element 2; and the function $F_n(\tau)$ is the signal delay between the nth and the first array elements.
3. The method of claim 1, wherein the sound source is located by determining the direction angle and the distance of the sound source according to the geometric relationship between the sound source and the array.
CN201811326998.9A 2018-11-08 2018-11-08 Voice recognition and sound source positioning method Pending CN111157949A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811326998.9A CN111157949A (en) 2018-11-08 2018-11-08 Voice recognition and sound source positioning method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811326998.9A CN111157949A (en) 2018-11-08 2018-11-08 Voice recognition and sound source positioning method

Publications (1)

Publication Number Publication Date
CN111157949A true CN111157949A (en) 2020-05-15

Family

ID=70555103

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811326998.9A Pending CN111157949A (en) 2018-11-08 2018-11-08 Voice recognition and sound source positioning method

Country Status (1)

Country Link
CN (1) CN111157949A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114034380A (en) * 2021-11-11 2022-02-11 上汽大众汽车有限公司 One-dimensional acoustic positioning method for engine pedestal

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114034380A (en) * 2021-11-11 2022-02-11 上汽大众汽车有限公司 One-dimensional acoustic positioning method for engine pedestal
CN114034380B (en) * 2021-11-11 2023-11-17 上汽大众汽车有限公司 One-dimensional acoustic positioning method for engine rack

Similar Documents

Publication Publication Date Title
Brandstein et al. A practical time-delay estimator for localizing speech sources with a microphone array
CN111044973B (en) MVDR target sound source directional pickup method for microphone matrix
CN106448722A (en) Sound recording method, device and system
CN101762806B (en) Sound source locating method and apparatus thereof
CN110534126B (en) Sound source positioning and voice enhancement method and system based on fixed beam forming
CN102324237A (en) Microphone array voice wave beam formation method, speech signal processing device and system
EP1899954A1 (en) System and method for extracting acoustic signals from signals emitted by a plurality of sources
KR20120059827A (en) Apparatus for multiple sound source localization and method the same
CN103117064A (en) Processing signals
CN109669159A (en) Auditory localization tracking device and method based on microphone partition ring array
CN105607042A (en) Method for locating sound source through microphone array time delay estimation
CN106992010B (en) Microphone array speech enhancement device under condition of no direct sound
CN109212481A (en) A method of auditory localization is carried out using microphone array
CN109188362A (en) A kind of microphone array auditory localization signal processing method
CN107167770A (en) A kind of microphone array sound source locating device under the conditions of reverberation
CN112363112B (en) Sound source positioning method and device based on linear microphone array
KR20090128221A (en) Method for sound source localization and system thereof
US20130253923A1 (en) Multichannel enhancement system for preserving spatial cues
Klein et al. Direction-of-arrival estimation using a microphone array with the multichannel cross-correlation method
Wan et al. Improved steered response power method for sound source localization based on principal eigenvector
CN111157949A (en) Voice recognition and sound source positioning method
CN110927668A (en) Sound source positioning optimization method of cube microphone array based on particle swarm
Himawan et al. Clustering of ad-hoc microphone arrays for robust blind beamforming
CN109061567B (en) Voice accurate positioning method under multi-source environment
Tervo et al. Interpolation methods for the SRP-PHAT algorithm

Legal Events

Date Code Title Description
PB01 Publication
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20200515