CN201639751U

CN201639751U - Fixed-direction and fixed-distance voice collecting system based on multi-microphone array

Info

Publication number: CN201639751U
Application number: CN2010201291853U
Authority: CN
Inventors: 潘帆; 罗彬�; 沈斌; 覃树建
Original assignee: CHENGDU DANMANI TECHNOLOGY Co Ltd
Current assignee: CHENGDU DANMANI TECHNOLOGY Co Ltd
Priority date: 2010-03-11
Filing date: 2010-03-11
Publication date: 2010-11-17
Anticipated expiration: 2020-03-11

Abstract

The utility model discloses a fixed-direction and fixed-distance voice collecting system based on a multi-microphone array. The system comprises a camera, the multi-microphone array, and a voice output device connected with the multi-microphone array. A beamforming calculation device and a beamforming processing device are arranged in sequence between the multi-microphone array and the voice output device, and the camera is connected with the beamforming calculation device through a target positioning device. By adopting multi-microphone array technology and forming a directional beam, the system receives the voice signal of a designated target at a fixed direction and a fixed distance, so that the user can monitor the voice of a suspicious target accurately and without interference, which improves monitoring efficiency.

Description

Fixed-direction and fixed-distance voice collecting system based on a multi-microphone array

Technical field

The utility model relates to the field of audio signal enhancement, and specifically to a fixed-direction and fixed-distance voice collecting system based on a multi-microphone array.

Background art

Existing surveillance systems generally output only a video signal, and audio is typically captured directly with a single microphone. In more complex scenes such as railway stations and squares, where pedestrian flow is heavy and many people speak at the same time, audio captured in this ordinary way does not let the user distinguish what different people are saying. If monitoring personnel notice suspicious persons whispering to each other and want to listen to their conversation, the ordinary audio collection method cannot meet this need.

Summary of the utility model

The purpose of the utility model is to overcome the shortcomings and deficiencies of the above prior art by providing a fixed-direction and fixed-distance voice collecting system based on a multi-microphone array. The system adopts multi-microphone array technology and, by forming a directional beam, receives the voice signal of a designated target at a fixed direction and a fixed distance, so that the user can exclude interference and listen accurately to the audio signal of a suspicious target, which improves monitoring efficiency.

The purpose of the utility model is achieved through the following technical solution: a fixed-direction and fixed-distance voice collecting system based on a multi-microphone array, comprising a camera, a multi-microphone array, and a voice output device connected with the multi-microphone array, wherein a beamforming calculation device and a beamforming processing device are arranged in sequence between the multi-microphone array and the voice output device, and the camera is connected with the beamforming calculation device through a target positioning device.

The target positioning device works as follows. After monitoring personnel find a suspicious target, the target positioning device locates the designated target on a single frame captured by the camera. The target positioning device consists of a target distance calculator and a target direction calculator. The target distance calculator calculates the distance from the target to the camera from the known camera mounting height and pitch angle, the image size of the suspicious target, and empirical data. The target direction calculator calculates the angle of the target relative to the camera's calibrated direction from the camera's deviation angle, its pitch angle, and a fixed position. The beamforming calculation device then uses the target distance, target direction, and other information calculated by the target positioning device to calculate the required directional beam coefficients.

The beamforming processing device works as follows. After the beamforming calculation device has calculated the required directional beam coefficients, the beamforming processing device uses these coefficients to filter the multi-channel voice signals collected by the multi-microphone array. The beam coefficients provide high gain for the voice signal at the target direction and target distance while suppressing voice signals from other directions and distances, so that only the voice of the designated target is received.
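The idea of coefficients that favor one direction and one distance can be illustrated with a minimal near-field delay-and-sum sketch. This is not the coefficient calculation of the utility model; the function names, the line-array geometry, the 2000 Hz test frequency, and the source positions are all assumptions made for illustration, and distance attenuation is ignored.

```python
import cmath
import math

SPEED_OF_SOUND_M_S = 343.0  # assumed speed of sound in air at about 20 degrees C


def focus_weights(mic_positions, target_xy, freq_hz):
    """Return one complex weight per microphone that phase-aligns a
    spherical wave arriving from target_xy (near-field delay-and-sum)."""
    k = 2.0 * math.pi * freq_hz / SPEED_OF_SOUND_M_S  # wavenumber in rad/m
    n = len(mic_positions)
    # Each weight cancels the propagation phase exp(-j*k*d) from the target.
    return [cmath.exp(1j * k * math.dist(m, target_xy)) / n for m in mic_positions]


def array_response(weights, mic_positions, source_xy, freq_hz):
    """Magnitude of the beamformer output for a unit point source at
    source_xy (distance attenuation is ignored for simplicity)."""
    k = 2.0 * math.pi * freq_hz / SPEED_OF_SOUND_M_S
    total = sum(
        w_i * cmath.exp(-1j * k * math.dist(m, source_xy))
        for w_i, m in zip(weights, mic_positions)
    )
    return abs(total)


# Hypothetical setup: 8 microphones on a line, 5 cm apart.
mics = [(0.05 * i, 0.0) for i in range(8)]
target = (1.0, 2.0)       # focus point supplied by the target positioning step
interferer = (-1.5, 2.0)  # a talker in another direction
weights = focus_weights(mics, target, freq_hz=2000.0)
on_target = array_response(weights, mics, target, 2000.0)
off_target = array_response(weights, mics, interferer, 2000.0)
print(f"on target: {on_target:.3f}, interferer: {off_target:.3f}")
```

Because the weights are derived from the exact microphone-to-target distances rather than from a plane-wave angle, the focus depends on range as well as direction. How sharply the array discriminates by distance depends on its aperture relative to the target range, so a small array far from the target behaves mostly like a direction-only beamformer.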

As a further improvement of the utility model, a noise elimination device is arranged between the beamforming processing device and the voice output device. The voice signal processed by the beamforming processing device passes through the noise elimination device, which further removes noise from the voice signal, thereby further improving the quality of the voice signal and making it easier for monitoring personnel to hear the speech clearly. The noise elimination device is finally connected to the voice output device, which outputs the speech.

To ensure that the most complete monitoring data is available for review in extreme cases, the camera is connected with a storage device, which stores the high-definition video captured by the camera; the voice output device is connected with a storage device, which stores the speech.

The above target positioning device consists of a target distance calculator and a target direction calculator.

The above beamforming calculation device is a beam coefficient calculator.

The above beamforming processing device is an adaptive filter bank.

The number of microphone array elements in the above multi-microphone array is at least 2.

The above voice output device is a loudspeaker, an earphone, or a network.

In summary, the beneficial effects of the utility model are as follows: by using fixed-direction and fixed-distance voice collecting technology, the system excludes other interfering signals and effectively collects the voice signal of the designated target, accomplishing a task that traditional voice collection cannot. It helps monitoring personnel listen effectively to the conversation of a suspicious target and judge whether there is a threat, and it has great application value in the field of public safety.

Brief description of the drawings

Fig. 1 is a schematic structural diagram of the utility model.

Detailed description

The utility model is described in further detail below with reference to an embodiment and the accompanying drawing, but the implementations of the utility model are not limited thereto.

Embodiment:

As shown in Fig. 1, the utility model comprises a camera, a multi-microphone array, and a voice output device connected with the multi-microphone array. A beamforming calculation device is arranged between the multi-microphone array and the voice output device; the camera is connected with the beamforming calculation device through a target positioning device; and a beamforming processing device and a noise elimination device are connected in sequence between the beamforming calculation device and the voice output device.

The working process of the utility model is as follows. The target positioning device locates the designated target on the image captured by the camera. The target positioning device consists of a target distance calculator and a target direction calculator. The target distance calculator calculates the distance from the target to the camera from the known camera mounting height and pitch angle, the image size of the suspicious target, and empirical data. The specific calculation is: with the pitch angle of the camera and the mounting height of the camera both known, the distance from the target to the camera can be calculated by trigonometry; in addition, empirical data obtained from a set of repeated tests is used to correct the result, so that the calculated distance is more accurate. The target direction calculator calculates the angle of the target relative to the camera's calibrated direction from the camera's deviation angle, its pitch angle, and a fixed position. The camera is mounted on a pan-tilt head, so the deviation angle of the camera is known; then, from the offset of the target from the center point of the image, the angle of the target relative to the fixed position can be calculated.
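The geometry described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the calculation used in the utility model: the target is assumed to stand on level ground at the image center, the camera is treated as an ideal pinhole with a known horizontal field of view, and the empirical correction is reduced to a single hypothetical scale factor. All function and parameter names are invented for this sketch.

```python
import math


def target_distance(mount_height_m, pitch_deg, correction=1.0):
    """Distance to a ground-level target at the image center.

    pitch_deg is the camera's downward tilt from horizontal.
    correction is a hypothetical stand-in for the empirical adjustment.
    Returns (ground_distance_m, line_of_sight_distance_m).
    """
    if not 0.0 < pitch_deg < 90.0:
        raise ValueError("pitch_deg must be between 0 and 90 degrees")
    pitch = math.radians(pitch_deg)
    ground = mount_height_m / math.tan(pitch)  # horizontal leg of the triangle
    slant = mount_height_m / math.sin(pitch)   # hypotenuse (line of sight)
    return ground * correction, slant * correction


def target_bearing(pan_deg, pixel_offset_x, image_width_px, hfov_deg):
    """Angle of the target relative to the calibrated reference direction.

    pan_deg is the pan-tilt head's deviation angle; pixel_offset_x is the
    target's horizontal offset from the image center (positive to the right).
    """
    # Pinhole model: focal length in pixels from the horizontal field of view.
    focal_px = (image_width_px / 2.0) / math.tan(math.radians(hfov_deg) / 2.0)
    return pan_deg + math.degrees(math.atan2(pixel_offset_x, focal_px))


ground_m, slant_m = target_distance(mount_height_m=3.0, pitch_deg=45.0)
bearing_deg = target_bearing(pan_deg=10.0, pixel_offset_x=0.0,
                             image_width_px=1920, hfov_deg=60.0)
print(ground_m, slant_m, bearing_deg)
```

The resulting distance and bearing are the two quantities that the beam coefficient calculation takes as input.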

After target positioning, the beamforming calculation device uses an improved MUSIC algorithm to calculate the coefficients of the required directional beam from the direction and distance of the target. Once the coefficients have been calculated, the beamforming processing device uses them. The beamforming processing device is an adaptive filter bank based on FIR filters and combined with an ICA blind signal separation algorithm. The filter bank takes the coefficients calculated by the beamforming calculation device as initial values and filters the voice signals collected by the multi-microphone array. The filtered result is then passed through the ICA blind signal separation algorithm, which further separates the mixed audio signals. The separated result is compared with the original voice signals collected by the multi-microphone array to obtain an error signal, and an improved NLMS algorithm uses this error signal to adjust the filter coefficients continually, so that a stable and clear voice signal of the target is finally output.
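The coefficient adaptation step can be illustrated with the standard (not the "improved") NLMS update, shown here as a single-channel sketch that identifies an unknown FIR filter. The source does not describe its improvements, its multi-channel structure, or how the ICA stage feeds the error signal, so none of that is modeled; the function name, step size, and test signal are assumptions.

```python
import random


def nlms(x, d, num_taps, mu=0.5, eps=1e-8, w_init=None):
    """Standard normalized LMS for a single-channel FIR filter.

    x: input samples, d: desired samples, w_init: optional starting
    coefficients (for example, values supplied by an earlier calculation).
    Returns (final_coefficients, error_signal).
    """
    w = list(w_init) if w_init is not None else [0.0] * num_taps
    buf = [0.0] * num_taps  # most recent input sample first
    errors = []
    for x_n, d_n in zip(x, d):
        buf = [x_n] + buf[:-1]
        y_n = sum(w_i * b_i for w_i, b_i in zip(w, buf))  # filter output
        e_n = d_n - y_n                                   # error signal
        power = eps + sum(b_i * b_i for b_i in buf)       # input energy
        # Step size is normalized by the input energy in the buffer.
        w = [w_i + mu * e_n * b_i / power for w_i, b_i in zip(w, buf)]
        errors.append(e_n)
    return w, errors


# Hypothetical test: recover a known 3-tap filter from noise-free data.
rng = random.Random(0)
true_taps = [0.5, -0.3, 0.1]
x = [rng.gauss(0.0, 1.0) for _ in range(2000)]
d = [
    sum(h * x[n - k] for k, h in enumerate(true_taps) if n - k >= 0)
    for n in range(len(x))
]
w, errors = nlms(x, d, num_taps=3)
print([round(w_i, 4) for w_i in w])
```

The `w_init` argument mirrors the idea of starting adaptation from precomputed beam coefficients rather than from zero, which typically shortens convergence when the initial values are already close to the solution.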

Finally, the voice signal output by the beamforming processing device is processed by the noise elimination device, whose purpose is to further remove noise from the voice signal. The noise elimination device judges whether the current input signal contains speech. If it does not, the signal is judged to be noise, and the noise spectrum is accumulated and calculated (the spectrum here is the energy value at each frequency after the time-domain signal is converted to a frequency-domain signal). If the current signal is judged to be speech, the noise spectrum accumulated from history is subtracted from the spectrum of the current signal, the result is converted to a gain coefficient, and the gain coefficient is applied to the original voice signal, thereby eliminating noise. The noise elimination device is finally connected to the voice output device, which is a loudspeaker, an earphone, or a network and is used to output the speech.
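The noise elimination logic described above matches the general pattern of spectral subtraction, which can be sketched as follows. The sketch works on per-frame power spectra and leaves the time-to-frequency conversion and the speech/noise decision to the caller. The class name, the recursive averaging used to accumulate the noise spectrum, the gain floor, and the behavior on noise-only frames are assumptions that the source does not specify.

```python
import math


class SpectralSubtractor:
    """Per-bin gain computation by power spectral subtraction."""

    def __init__(self, num_bins, smoothing=0.9, gain_floor=0.05):
        self.noise_power = [0.0] * num_bins
        self.smoothing = smoothing    # assumed recursive-average factor
        self.gain_floor = gain_floor  # assumed lower bound on the gain
        self._has_estimate = False

    def process(self, frame_power, is_speech):
        """Return one amplitude gain per frequency bin for this frame."""
        if not is_speech:
            # Noise-only frame: accumulate the noise spectrum and attenuate.
            self._update_noise(frame_power)
            return [self.gain_floor] * len(frame_power)
        gains = []
        for p, n in zip(frame_power, self.noise_power):
            clean = max(p - n, 0.0)  # subtract the accumulated noise power
            g = math.sqrt(clean / p) if p > 0.0 else 0.0
            gains.append(max(g, self.gain_floor))
        return gains

    def _update_noise(self, frame_power):
        if not self._has_estimate:
            self.noise_power = list(frame_power)
            self._has_estimate = True
            return
        a = self.smoothing
        self.noise_power = [
            a * n + (1.0 - a) * p for n, p in zip(self.noise_power, frame_power)
        ]


# Hypothetical frames with 4 frequency bins each.
suppressor = SpectralSubtractor(num_bins=4)
suppressor.process([1.0, 1.0, 1.0, 1.0], is_speech=False)
gains = suppressor.process([1.0, 2.0, 5.0, 101.0], is_speech=True)
print(gains)
```

The returned gains are meant to be multiplied with the amplitude spectrum of the original frame before converting back to the time domain. The gain floor keeps bins from being zeroed completely, which is a common way to limit the "musical noise" artifacts of plain subtraction.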

The utility model can be well implemented as described above.

Claims

1. A fixed-direction and fixed-distance voice collecting system based on a multi-microphone array, comprising a camera, a multi-microphone array, and a voice output device connected with the multi-microphone array, characterized in that a beamforming calculation device and a beamforming processing device are arranged in sequence between the multi-microphone array and the voice output device, and the camera is connected with the beamforming calculation device through a target positioning device.

2. The fixed-direction and fixed-distance voice collecting system based on a multi-microphone array according to claim 1, characterized in that a noise elimination device is arranged between the beamforming processing device and the voice output device.

3. The fixed-direction and fixed-distance voice collecting system based on a multi-microphone array according to claim 1, characterized in that the camera is connected with a storage device.

4. The fixed-direction and fixed-distance voice collecting system based on a multi-microphone array according to claim 1, characterized in that the voice output device is connected with a storage device.

5. The fixed-direction and fixed-distance voice collecting system based on a multi-microphone array according to claim 1, characterized in that the target positioning device consists of a target distance calculator and a target direction calculator.

6. The fixed-direction and fixed-distance voice collecting system based on a multi-microphone array according to claim 1, characterized in that the beamforming calculation device is a beam coefficient calculator.

7. The fixed-direction and fixed-distance voice collecting system based on a multi-microphone array according to claim 1, characterized in that the beamforming processing device is an adaptive filter bank.

8. The fixed-direction and fixed-distance voice collecting system based on a multi-microphone array according to claim 1, characterized in that the number of microphone array elements in the multi-microphone array is at least 2.

9. The fixed-direction and fixed-distance voice collecting system based on a multi-microphone array according to claim 1, characterized in that the voice output device is a loudspeaker, an earphone, or a network.