CN100489962C - Sound direction recognition apparatus and method

Info

Publication number: CN100489962C
Application number: CNB031310036A
Authority: CN
Inventors: 罗立声
Original assignee: Sunplus Technology Co Ltd
Current assignee: Lingtong Technology Co., Ltd.
Priority date: 2003-05-14
Filing date: 2003-05-14
Publication date: 2009-05-20
Anticipated expiration: 2023-05-14
Also published as: CN1549242A

Abstract

The present invention relates to a device and method for recognizing the direction of sound. A plurality of sound source search units receive a plurality of sound wave signals, amplify and filter them, and convert them into a plurality of converted pulse signals. A processing unit samples the converted pulse signals to obtain a plurality of sampled signal sequences, from which a plurality of time differences are found and then used in a lookup table to obtain the corresponding sound source position.

Description

Sound direction recognition device and method

Technical field

The invention relates to the technical field of sound direction recognition, and in particular to a sound direction recognition device and method.

Background technology

Fig. 1 is a schematic diagram of human ears receiving sound. The sound source is located at position A, and its sound wave reaches the left ear 11 and the right ear 12 one after the other, producing a time difference; the brain then recognizes the direction of the source from this time difference. In practice, sound is picked up mainly with microphones, and two or more microphones are used to pick up sound and recognize its direction. Using two non-directional microphones to recognize the sound source direction has a limitation: the two microphones can distinguish only sources to the left and right and cannot distinguish sources in front from sources behind. Front/back recognition requires either a more complex algorithm or directional microphones, so sound direction recognition usually uses three non-directional microphones to cover 360 degrees.

At present there are two main techniques for sound direction recognition. The first is the peak detection method, which amplifies, filters and integrates the sound wave received by each microphone so that the wave resembles a triangular wave, then finds the peak of the triangular wave for each microphone and compares those peaks to obtain the time difference. Finally, the mathematical formula

ΔT = \frac{aθ + a \sin θ}{c}

(where c is the speed of sound and ΔT is the time difference), together with the time difference to incident angle conversion diagram shown in Fig. 2, yields the incident angle of the sound and hence the sound source position.

The second is the cross-correlation method. The sound wave received by each microphone is suitably amplified and filtered and then converted into digital data by an analog-to-digital converter (ADC). A cross-correlation is computed over the digital data of the different microphones to obtain the maximum correlation value (which gives the time difference), and the incident angle of the sound is found from this maximum correlation value.
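The following is a minimal Python sketch of the cross-correlation idea described above, assuming multi-bit integer samples such as an ADC would produce. The function name `cross_correlation_lag`, the fixed comparison window and the sample values are illustrative assumptions and do not come from the source.

```python
def cross_correlation_lag(a, b, max_lag):
    """Return the lag in [0, max_lag] at which b best matches a.

    For each candidate lag the score is sum(a[n] * b[n + lag]) over a
    fixed window, so every lag is scored over the same number of terms.
    Each term costs one multiplication, which is the computational
    burden this background method carries.
    """
    window = len(a) - max_lag  # number of terms compared for every lag
    best_lag, best_score = 0, None
    for lag in range(max_lag + 1):
        score = sum(a[n] * b[n + lag] for n in range(window))
        if best_score is None or score > best_score:
            best_lag, best_score = lag, score
    return best_lag


# Hypothetical data: b is a copy of a delayed by two samples.
a = [0, 1, 3, 1, 0, 0, 0]
b = [0, 0, 0, 1, 3, 1, 0]
lag = cross_correlation_lag(a, b, max_lag=3)
```

With these made-up sequences the best lag is 2, matching the two-sample delay built into `b`.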

However, both methods require an ADC, which raises cost. In addition, microphones are generally of the condenser type, and because the equivalent capacitance differs from one microphone to another, a time shift arises that affects the direction judgment. The cross-correlation method also requires operations such as accumulation over very long digital data strings, so the amount of computation is large and multiplication is needed. How to design a sound direction recognition device that is unaffected by differences between condenser microphones, needs no ADC and needs no heavy computation has therefore become a problem in urgent need of a solution.

Summary of the invention

The main object of the present invention is to provide a sound direction recognition device and method that can recognize the direction of sound without an analog-to-digital converter (ADC), without being affected by differences between condenser microphones, and without multiplication or heavy computation.

According to one feature of the present invention, the sound direction recognition device provided comprises:

A plurality of sound source search units, each of which receives a sound wave signal, amplifies and filters it to obtain an amplified sound wave signal, and then performs signal conversion on the amplified sound wave signal to obtain a converted pulse signal; and

A processing unit, connected to the sound source search units, which samples the converted pulse signals output by the sound source search units to obtain a plurality of sampled signal sequences, then obtains a plurality of time differences from the sampled signal sequences by a maximum likelihood method, and looks up the time differences in a table to obtain the sound source position of the sound wave signals;

Wherein the sampled signal sequences are expressed as

\vec{x}_{1}, \vec{x}_{2}, \vec{x}_{3} \in \{1, 0\}

the sample length used by the processing unit for the sampled signal sequences is L, and the time differences are denoted Δ₁, Δ₂, Δ₃, where Δ₁ is the time difference between \vec{x}_{1} and \vec{x}_{2}, Δ₂ is the time difference between \vec{x}_{2} and \vec{x}_{3}, and Δ₃ is the time difference between \vec{x}_{3} and \vec{x}_{1}. The maximum likelihood method operates according to the function L(a|x) = f(x|a), where a is an element of A and x is an element of S: if a is Δ₁, then x is the inner product of \vec{x}_{1}(n) and \vec{x}_{2}(n + Δ₁); if a is Δ₂, then x is the inner product of \vec{x}_{2}(n) and \vec{x}_{3}(n + Δ₂); if a is Δ₃, then x is the inner product of \vec{x}_{3}(n) and \vec{x}_{1}(n + Δ₃). A is the set of possible time differences with A ∈ {0, Δ_{possible max}}, Δ_{possible max} is the largest of Δ₁, Δ₂ and Δ₃, and S ∈ {1, 0}; the time differences are found such that L(a|x) = f(x|a) is maximized.

In the sound direction recognition device, each sound source search unit further comprises a preamplifier and a signal detector for converting the sound wave signal into the converted pulse signal having high and low levels.

In the sound direction recognition device, each sound source search unit further comprises a sound receiver and a post-stage filter amplifier; the preamplifier is connected to the sound receiver and to the post-stage filter amplifier, and the signal detector is connected to the post-stage filter amplifier and to the processing unit.

In the sound direction recognition device, each preamplifier uses a bipolar transistor as its driver.

In the sound direction recognition device, the bipolar transistor is an NPN transistor.

In the sound direction recognition device, each signal detector is a zero-crossing detector (ZCD).

In the sound direction recognition device, the processing unit performs the lookup with an incident angle correspondence table that holds a plurality of preset time differences and a plurality of incident angle values; the time differences are compared against the preset time differences to find the corresponding incident angle values and thereby obtain the sound source position of the sound wave signals.

According to another feature of the present invention, the sound direction recognition method provided comprises the following steps:

A recognition parameter setting step, which sets a sample length parameter and a recognition count parameter;

A sound wave signal conversion step, which receives a plurality of sound wave signals and converts them into a plurality of converted pulse signals;

A sampling step, which samples the converted pulse signals according to the sample length parameter and obtains a plurality of time differences by a maximum likelihood method; and

A lookup step, which compares the time differences against an incident angle correspondence table to obtain a plurality of sound wave signal incident angles, from which the sound source position of the sound wave signals is obtained.

The sound direction recognition method further comprises an averaging step: after the lookup step has obtained the sound wave signal incident angles, they are stored temporarily, and the sampling step and the lookup step are executed a plurality of times according to the recognition count parameter to obtain a plurality of groups of sound wave signal incident angles, which are then averaged;

Wherein the maximum likelihood method operates according to the following function: L(a|x) = f(x|a), where a is an element of A and x is an element of S; if a is Δ₁, then

x = \vec{x}_{1}(n) \cdot \vec{x}_{2}(n + \Delta_{1}),

if a is Δ₂, then

x = \vec{x}_{2}(n) \cdot \vec{x}_{3}(n + \Delta_{2}),

if a is Δ₃, then

x = \vec{x}_{3}(n) \cdot \vec{x}_{1}(n + \Delta_{3}),

A is the set of possible time differences with A ∈ {0, Δ_{possible max}}, Δ_{possible max} is the largest of Δ₁, Δ₂ and Δ₃, and S ∈ {1, 0}; the time differences are found such that L(a|x) = f(x|a) is maximized, where

\vec{x}_{1}, \vec{x}_{2}, \vec{x}_{3} \in \{1, 0\}

are the sampled signal sequences of the converted pulse signals, L is the sample length parameter, and Δ₁, Δ₂, Δ₃ are the time differences: Δ₁ is the time difference between \vec{x}_{1} and \vec{x}_{2}, Δ₂ is the time difference between \vec{x}_{2} and \vec{x}_{3}, and Δ₃ is the time difference between \vec{x}_{3} and \vec{x}_{1}.

In the sound direction recognition method, very large or very small incident angles are excluded before the averaging is carried out.

Description of drawings

Fig. 1 is a schematic diagram of human ears receiving sound.

Fig. 2 is a schematic diagram of the conversion between time difference and incident angle.

Fig. 3 is the functional block diagram of a preferred embodiment of the present invention.

Fig. 4 is a circuit diagram of the sound source search unit of a preferred embodiment of the present invention.

Fig. 5 is a schematic diagram of the microphone placement of a preferred embodiment of the present invention.

Fig. 6 is the operation flowchart of a preferred embodiment of the present invention.

Embodiment

For a preferred embodiment of the present invention, refer to the functional block diagram of Fig. 3. It consists mainly of three sound source search units 31, 32, 33 and a processing unit 34. The sound source search units 31, 32, 33 have microphones 311, 321, 331, preamplifiers 312, 322, 332, post-stage filter amplifiers 313, 323, 333 and signal detectors 314, 324, 334, respectively.

The outputs of the sound source search units 31, 32, 33 are all connected to the input of the processing unit 34. The units receive a plurality of sound wave signals through the microphones 311, 321, 331, convert them into a plurality of converted pulse signals, and output these to the processing unit 34 for sound direction recognition. The outputs of the microphones 311, 321, 331 are connected to the inputs of the preamplifiers 312, 322, 332; the outputs of the preamplifiers 312, 322, 332 are connected to the inputs of the post-stage filter amplifiers 313, 323, 333; and the outputs of the post-stage filter amplifiers 313, 323, 333 are connected to the signal detectors 314, 324, 334.

In this embodiment, the preamplifiers 312, 322, 332 use a bipolar transistor as the driver, for example an NPN bipolar transistor, so that current-mode control avoids the time shift produced by conventional condenser microphones; the preamplifiers apply front-stage amplification to the sound wave signals picked up by the microphones 311, 321, 331 so that the features of the sound wave signals stand out. In this embodiment, the signal detectors 314, 324, 334 are preferably zero-crossing detectors (ZCD), which detect the sound wave signals and produce converted pulse signals having high and low levels (that is, zero-crossing signals).
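As a rough software analogue of the zero-crossing detector described above, the sketch below maps each sample of a waveform to a high (1) or low (0) level depending on its sign. In the device this conversion is done in hardware before sampling; the function name `zero_crossing_pulse`, the zero threshold and the test waveform are assumptions made for illustration only.

```python
import math


def zero_crossing_pulse(samples, threshold=0.0):
    """Map each sample to 1 (high level) if above threshold, else 0 (low level)."""
    return [1 if s > threshold else 0 for s in samples]


# Hypothetical input: 16 samples of a sine wave, starting at a phase of
# pi/8 so that no sample lands exactly on the zero threshold.
wave = [math.sin(2 * math.pi * k / 8 + math.pi / 8) for k in range(16)]
pulse = zero_crossing_pulse(wave)
```

The resulting `pulse` list contains only 0 and 1, which is the kind of one-bit sequence the processing unit works on, so no multi-bit ADC value is needed.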

The sound source search units 31, 32, 33 described above can be built from commonly known electronic components; Fig. 4 shows one circuit diagram for them. Fig. 5 shows the placement of the microphones 311, 321, 331, which are generally located at the three vertices of an equilateral triangle. How the processing unit 34 recognizes the sound signals is explained below.

Fig. 6 shows the flowchart of the sound direction recognition method of the present invention. First, the number of recognitions (N) and the sample length (L) are set (step S601). Next, the microphones 311, 321, 331 begin to receive the plurality of sound wave signals emitted by the sound source; the preamplifiers 312, 322, 332 amplify those signals so that their features stand out, and the post-stage filter amplifiers 313, 323, 333 apply a further stage of amplification and filtering to the front-stage amplified signals so that they can be detected by the signal detectors 314, 324, 334 (step S602). The filtering can of course also be carried out by additional external components.

Next, the signal detectors 314, 324, 334 perform zero-crossing detection on the amplified sound wave signals to produce a plurality of converted pulse signals having high and low levels (step S603) and send them to the processing unit 34. The processing unit 34 samples the converted pulse signals at a preset sampling frequency (fs) to obtain a plurality of corresponding sampled signal sequences (step S604). The preset sampling frequency is set according to the spacing of the microphones 311, 321, 331 in Fig. 5, and the sampled signal sequences are expressed as

\vec{x}_{1}, \vec{x}_{2}, \vec{x}_{3} \in \{1, 0\}

with the sample length of the sampled signal sequences being L.
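The source states only that the sampling frequency is set according to the microphone spacing. One plausible reading, sketched below, is that the largest delay between two microphones equals their spacing divided by the speed of sound, so the sampling frequency determines how many sample periods that delay spans. The function name, the speed-of-sound value and the example numbers are assumptions.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s in air at about 20 degrees C (assumed value)


def max_delay_in_samples(spacing_m, fs_hz, c=SPEED_OF_SOUND):
    """Largest inter-microphone delay, in sample periods, rounded up.

    spacing_m / c is the travel time of sound across the microphone
    spacing; multiplying by fs_hz expresses it in sample periods.
    """
    return math.ceil(spacing_m / c * fs_hz)


# Hypothetical setup: microphones 0.1 m apart, sampled at 100 kHz.
search_range = max_delay_in_samples(0.1, 100_000)
```

Under these assumed numbers the delay search would need to cover about 30 sample periods; a higher sampling frequency gives finer time resolution at the cost of a wider search range.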

After the processing unit 34 has obtained the sampled signal sequence of each amplified sound signal, in step S605 it obtains a plurality of time differences from the sampled signal sequences by the maximum likelihood method; that is, each time difference is obtained from two different sampled signal sequences. The time differences are denoted Δ₁, Δ₂, Δ₃, where Δ₁ is the time difference between \vec{x}_{1} and \vec{x}_{2}, Δ₂ is the time difference between \vec{x}_{2} and \vec{x}_{3}, and Δ₃ is the time difference between \vec{x}_{3} and \vec{x}_{1}. The maximum likelihood method operates according to the following function:

L(a|x) = f(x|a), where a is an element of A and x is an element of S; if a is Δ₁, then

x = \vec{x}_{1}(n) \cdot \vec{x}_{2}(n + \Delta_{1}),

if a is Δ₂, then

x = \vec{x}_{2}(n) \cdot \vec{x}_{3}(n + \Delta_{2}),

if a is Δ₃, then

x = \vec{x}_{3}(n) \cdot \vec{x}_{1}(n + \Delta_{3}),

A is the set of possible time differences with A ∈ {0, Δ_{possible max}}, Δ_{possible max} is the largest of Δ₁, Δ₂ and Δ₃, and S ∈ {1, 0}; the time differences are found such that L(a|x) = f(x|a) is maximized. Because the signals handled by the processing unit 34 are in {0, 1}, the multiplications involved in the computation can be replaced by logical AND operations, which reduces the amount of computation.
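The delay search described above can be sketched in Python as follows. This is a minimal illustration, assuming that maximizing L(a|x) amounts to choosing the candidate delay whose inner product, computed with bitwise AND instead of multiplication, is largest; the source does not spell out f(x|a), so that reading, the function name `estimate_delay`, the fixed window and the sample data are all assumptions.

```python
def estimate_delay(x_a, x_b, max_delay):
    """Return the delay d in [0, max_delay] maximizing sum(x_a[n] AND x_b[n + d]).

    x_a and x_b are equal-length lists of 0/1 samples. For one-bit data
    the product x_a[n] * x_b[n + d] equals x_a[n] & x_b[n + d], so the
    inner product needs no multiplier.
    """
    window = len(x_a) - max_delay  # same number of terms for every delay
    best_delay, best_score = 0, -1
    for d in range(max_delay + 1):
        score = sum(x_a[n] & x_b[n + d] for n in range(window))
        if score > best_score:
            best_delay, best_score = d, score
    return best_delay


# Hypothetical pulse pattern; x_2 is x_1 delayed by three samples.
pattern = [1, 1, 1, 0, 0, 1, 0, 1, 1, 0, 0, 0]
x_1 = pattern + [0, 0, 0, 0]
x_2 = [0, 0, 0] + pattern + [0]
delta_1 = estimate_delay(x_1, x_2, max_delay=4)
```

For these made-up sequences the search returns 3. Running the same function on the pairs (x₂, x₃) and (x₃, x₁) would give the other two time differences.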

Then, in step S606, the processing unit 34 compares the time differences against an incident angle correspondence table (not shown). The table holds a plurality of time differences and the corresponding incident angles, and is built in advance from the placement of the microphones 311, 321, 331, the time difference to incident angle conversion diagram shown in Fig. 2, and the mathematical formula

ΔT = \frac{aθ + a \sin θ}{c}

The processing unit 34 could of course compute the incident angle directly from the formula, but this would add to its computational burden.
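A table of this kind could be precomputed along the following lines. The sketch evaluates the formula above for whole-degree angles from 0 to 90 and then finds the nearest stored time difference by binary search. The source does not define the constant a or the angle range, so treating a as a length in metres, the 0 to 90 degree range, the one-degree step and all function names are assumptions.

```python
import math
from bisect import bisect_left

SPEED_OF_SOUND = 343.0  # m/s (assumed value)


def time_difference(theta_rad, a, c=SPEED_OF_SOUND):
    """Evaluate delta_T = (a * theta + a * sin(theta)) / c."""
    return (a * theta_rad + a * math.sin(theta_rad)) / c


def build_table(a, step_deg=1, max_deg=90):
    """List of (time difference, angle in degrees), ascending in time difference."""
    return [(time_difference(math.radians(d), a), d)
            for d in range(0, max_deg + 1, step_deg)]


def lookup_angle(table, delta_t):
    """Return the tabulated angle whose time difference is closest to delta_t."""
    times = [t for t, _ in table]
    i = bisect_left(times, delta_t)
    if i == 0:
        return table[0][1]
    if i == len(table):
        return table[-1][1]
    before, after = table[i - 1], table[i]
    return before[1] if delta_t - before[0] <= after[0] - delta_t else after[1]


# Hypothetical geometry constant a = 0.1 m.
table = build_table(a=0.1)
angle = lookup_angle(table, time_difference(math.radians(30), 0.1))
```

Because θ + sin θ increases steadily over this range, the table is sorted by time difference and a binary search suffices; at run time the lookup involves only comparisons, with no trigonometric evaluation.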

Because slight errors may arise between reception of the sound signals by the microphones 311, 321, 331 and completion of sampling, the processing unit 34 reduces the error as follows: after obtaining an incident angle it first stores the angle temporarily in a register or buffer, then repeats steps S604, S605 and S606 according to the previously set number of recognitions to obtain a plurality of incident angles. After excluding very large or very small values that may be outliers, it applies statistical processing such as sorting and averaging to the incident angles to obtain a closer estimate of the incident angle (step S607), and then obtains the position of the sound source from this incident angle (step S608).
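The averaging step can be illustrated with a simple trimmed mean. The sketch below sorts the buffered incident angles, drops the single largest and single smallest value, and averages the rest. How many extremes the device actually discards is not stated in the source, so dropping one at each end, the function name `robust_angle` and the sample angles are assumptions.

```python
def robust_angle(angles):
    """Average the incident angles after discarding the smallest and largest.

    Angles are treated as plain numbers; wrap-around at 360 degrees is
    not handled in this sketch.
    """
    if len(angles) < 3:
        # Too few values to discard extremes; fall back to a plain mean.
        return sum(angles) / len(angles)
    trimmed = sorted(angles)[1:-1]
    return sum(trimmed) / len(trimmed)


# Hypothetical results of N = 5 recognition passes, in degrees.
estimate = robust_angle([30, 31, 29, 90, 2])
```

Here the outliers 90 and 2 are dropped and the remaining values average to 30.0.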

As the above description shows, the present invention uses front-stage amplification by a bipolar transistor to amplify the sound signal received by each microphone, uses a zero-crossing detector to convert the sound signal into a converted pulse signal having high and low levels, then has the processing unit sample the converted pulse signals to obtain a plurality of sampled signal sequences, obtains the time differences from those sequences by the maximum likelihood method, and finally uses a lookup table to obtain the incident angle of the sound signal and find the sound source position. The direction of sound is thus recognized without an analog-to-digital converter (ADC), without being affected by differences between condenser microphones, and without multiplication or heavy computation.

The above embodiment is given only as an example for convenience of description; the scope of rights claimed by the present invention shall be as set out in the claims and is not limited to the above embodiment.

Claims

1. A sound direction recognition device, comprising:

A plurality of sound source search units, each of which receives a sound wave signal, amplifies and filters it to obtain an amplified sound wave signal, and then performs signal conversion on the amplified sound wave signal to obtain a converted pulse signal, wherein each sound source search unit comprises a preamplifier and a signal detector for converting the sound wave signal into the converted pulse signal having high and low levels, the signal detector being a zero-crossing detector; and

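Wherein the sampled signal sequences are expressed as

\vec{x}_{1}, \vec{x}_{2}, \vec{x}_{3} \in \{1, 0\}

the sample length used by the processing unit for the sampled signal sequences is L, and the time differences are denoted Δ₁, Δ₂, Δ₃, where Δ₁ is the time difference between \vec{x}_{1} and \vec{x}_{2}, Δ₂ is the time difference between \vec{x}_{2} and \vec{x}_{3}, and Δ₃ is the time difference between \vec{x}_{3} and \vec{x}_{1}; and the maximum likelihood method operates according to the function L(a|x) = f(x|a), where a is an element of A and x is an element of S: if a is Δ₁, then x is the inner product of \vec{x}_{1}(n) and \vec{x}_{2}(n + Δ₁); if a is Δ₂, then x is the inner product of \vec{x}_{2}(n) and \vec{x}_{3}(n + Δ₂); if a is Δ₃, then x is the inner product of \vec{x}_{3}(n) and \vec{x}_{1}(n + Δ₃); A is the set of possible time differences with A ∈ {0, Δ_{possible max}}, Δ_{possible max} is the largest of Δ₁, Δ₂ and Δ₃, and S ∈ {1, 0}; the time differences are found such that L(a|x) = f(x|a) is maximized.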

2. The sound direction recognition device as claimed in claim 1, wherein each sound source search unit comprises a sound receiver and a post-stage filter amplifier, the preamplifier is connected to the sound receiver and to the post-stage filter amplifier, and the signal detector is connected to the post-stage filter amplifier and to the processing unit.

3. The sound direction recognition device as claimed in claim 1, wherein each preamplifier uses a bipolar transistor as its driver.

4. The sound direction recognition device as claimed in claim 3, wherein the transistor is an NPN transistor.

5. The sound direction recognition device as claimed in claim 1, wherein the processing unit performs the lookup with an incident angle correspondence table that holds a plurality of preset time differences and a plurality of incident angle values, and the time differences are compared against the preset time differences to find the corresponding incident angle values and thereby obtain the sound source position of the sound wave signals.

6. A sound direction recognition method, comprising the following steps:

A sampling step, which samples the converted pulse signals according to the sample length parameter and obtains a plurality of time differences by a maximum likelihood method;

A lookup step, which compares the time differences against an incident angle correspondence table to obtain a plurality of sound wave signal incident angles, from which the sound source position of the sound wave signals is obtained; and

An averaging step, in which, after the lookup step has obtained the sound wave signal incident angles, they are stored temporarily, and the sampling step and the lookup step are executed a plurality of times according to the recognition count parameter to obtain a plurality of groups of sound wave signal incident angles, which are then averaged;

Wherein the maximum likelihood method operates according to the function L(a|x) = f(x|a), where a is an element of A and x is an element of S: if a is Δ₁, then x is the inner product of \vec{x}_{1}(n) and \vec{x}_{2}(n + Δ₁); if a is Δ₂, then x is the inner product of \vec{x}_{2}(n) and \vec{x}_{3}(n + Δ₂); if a is Δ₃, then x is the inner product of \vec{x}_{3}(n) and \vec{x}_{1}(n + Δ₃); A is the set of possible time differences with A ∈ {0, Δ_{possible max}}, Δ_{possible max} is the largest of Δ₁, Δ₂ and Δ₃, and S ∈ {1, 0}; the time differences are found such that L(a|x) = f(x|a) is maximized, where

\vec{x}_{1}, \vec{x}_{2}, \vec{x}_{3} \in \{1, 0\}

are the sampled signal sequences of the converted pulse signals, L is the sample length parameter, and Δ₁, Δ₂, Δ₃ are the time differences: Δ₁ is the time difference between \vec{x}_{1} and \vec{x}_{2}, Δ₂ is the time difference between \vec{x}_{2} and \vec{x}_{3}, and Δ₃ is the time difference between \vec{x}_{3} and \vec{x}_{1}.

7. The sound direction recognition method as claimed in claim 6, wherein very large or very small incident angles are excluded before the averaging is carried out.