CN103576126A

CN103576126A - Four-channel array sound source positioning system based on neural network

Info

Publication number: CN103576126A
Application number: CN201210264336.XA
Authority: CN
Inventors: 姜楠; 赛音; 傅洋; 张超
Original assignee: 姜楠
Priority date: 2012-07-27
Filing date: 2012-07-27
Publication date: 2014-02-12

Abstract

The invention discloses a sound positioning system design based on a microphone array that can be used for human-computer interaction in place of keyboard input. Microphone array sound source positioning uses an array of several microphones arranged in a certain geometric structure. The same sound source reaches different microphones at different times. The time delays with which the sound reaches the different microphones are collected to calculate the position of the sound source, which represents different function commands; a host computer sends the corresponding action response to realize human-computer interaction. The sounds produced when fingers tap a desktop are collected; on the basis of a classical time delay estimation algorithm, the delay estimates are processed with a BP neural network to determine the position and obtain the input key value, namely the key the finger tapped. As a result, a traditional PC keyboard can be replaced to some extent, and command input for human-computer interaction is realized.

Description

Four-channel array sound source positioning system based on neural network

Technical field

The present invention relates to digital signal processing and embedded system design, and in particular to multi-channel signal source localization technology.

Background technology

Over the past few decades, human-computer interaction technology has advanced rapidly. In particular, the spread of automated electronic equipment has driven very fast growth in the development of embedded human-computer interaction modes. How to interact with a computer through multiple input and output devices is an important topic in multimedia research. Developing acoustic keyboards and acoustic input technology helps advance computer input devices. As the technology is optimized and refined, the low cost, low power consumption, environmental friendliness, and convenience of a commercial input device are expected to bring considerable economic and social benefits.

Sound source localization based on a microphone array uses an array of several microphones arranged in a certain geometry. The same sound source reaches different microphones at different times. By collecting the time delays with which the sound reaches the different microphones, the position of the sound source is calculated and mapped to different command functions; the host computer then issues the corresponding action response, realizing human-computer interaction.

The system of the present invention implements such a sound localization model based on a microphone array. Building on a classical time delay estimation algorithm, it processes the delay estimates with a BP neural network to determine the position, realizing command input for human-computer interaction.

Summary of the invention

The technical problem to be solved by the present invention is as follows. At low sampling rates, classical signal processing methods produce large localization errors and cannot support responsive human-computer interaction. At high sampling rates, the data stream to be processed is large, the demand on computing power is high, and continuous high-load operation is costly, which is unsuitable for an input device. The aim is to obtain the finest possible sound source localization without increasing the computational load or the sampling rate, overcoming factors such as noise and low power consumption, and thereby to realize command input that replaces a conventional keyboard as the mode of human-computer interaction. To this end, we study and use synchronous sampling with signal fidelity, time delay estimation, and BP neural network technology to implement this system.

To solve the above technical problem, the present invention adopts the following technical scheme. Based on data acquisition and sound-fidelity hardware circuits, sound is captured over multiple channels; after analog-to-digital conversion the sound signals form a multidimensional array, from which the sound source position is calculated. A tapping sound is a signal similar to an impact pulse: it has a steep leading edge when the tap occurs and is a non-stationary signal. Because the signal is produced by acoustic vibration, it oscillates; when the tap occurs, the amplitude rises rapidly and generally begins to decay after 2 to 3 peaks. The average speed of sound in air is usually taken as 340 m/s, so a distance difference of 0.01 m causes a time difference of 2.94 × 10^-5 s. A medium-speed digital acquisition circuit can reach sampling frequencies of 44 kHz to 200 kHz, which ensures that the relative arrival times of the sound at the different array elements are captured fairly accurately. Calculating the phase differences therefore reflects the true relationship among the multiple signals, from which the sound source position is calculated and the corresponding key value obtained, so that the system serves as an input device.
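The arithmetic in the paragraph above can be checked with a short sketch. The function names are hypothetical and chosen only for illustration; the 340 m/s speed of sound and the 44 kHz and 200 kHz sampling rates are the values quoted in the text.

```python
SPEED_OF_SOUND = 340.0  # m/s, the estimate used in the text


def arrival_time_difference(path_difference_m, speed=SPEED_OF_SOUND):
    """Time difference in seconds caused by a path-length difference in metres."""
    return path_difference_m / speed


def delay_in_samples(path_difference_m, sample_rate_hz, speed=SPEED_OF_SOUND):
    """The same time difference expressed in sampling periods."""
    return arrival_time_difference(path_difference_m, speed) * sample_rate_hz


dt = arrival_time_difference(0.01)
print(f"0.01 m path difference: {dt:.3e} s")
for fs in (44_000, 200_000):
    print(f"at {fs} Hz: {delay_in_samples(0.01, fs):.2f} sampling periods")
```

A 0.01 m difference spans roughly 1.3 sampling periods at 44 kHz and about 5.9 at 200 kHz, which illustrates why raw delay estimates at the lower rate are coarse and why the text adds a further processing stage.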

In the ideal case, the desktop is infinite and isotropic, and the array elements are free of channel inconsistency, mutual coupling, and similar effects. In practical engineering applications, however, various errors are unavoidable. Microphone array errors appear mainly as inconsistent performance among the acoustic sensors, array structure errors caused by element spacings that are not strictly equal, and amplitude and phase errors between the channels of the overall microphone array data acquisition system. Desktop conditions are also complex: desktops differ in size and shape, the density of the wood varies, and multiple propagating waves superimpose unpredictably. Traditional time delay estimation algorithms therefore face great challenges in localization accuracy. By incorporating a neural network model, the present invention addresses the general applicability of the algorithm.

The hardware circuit design comprises a sound pickup array, a signal synchronization adjustment circuit, a signal fidelity adjustment circuit, a multi-channel data acquisition board, and a PCI input.

The software design comprises discrete signal processing, a denoising filter, time delay estimation by the generalized cross-correlation function method, and a BP neural network.

The beneficial effects of the present invention are as follows:

The present invention localizes the sound produced by a finger tapping the desktop to determine the key position, replacing a conventional keyboard in an environmentally friendly and convenient way.

By adopting the above hardware circuit design, the present invention can acquire multi-channel sound signals synchronously and with fidelity.

By adopting the above software algorithm design, the present invention can localize the sound signal source and identify key values in place of a keyboard.

Description of the drawings

Fig. 1 is the overall system diagram of the present invention.

Fig. 2 is the microphone array layout used for sound sampling in the present invention.

Fig. 3 is the hardware circuit diagram of the present invention.

Fig. 4 shows the multi-channel synchronous, fidelity-preserved signals actually acquired by the hardware of the present invention.

Fig. 5 is the flow chart of the time delay estimation algorithm adopted in the present invention.

Fig. 6 is the network design diagram of the neural network algorithm adopted in the present invention.

Fig. 7 shows the test results and accuracy of the present invention.

Embodiment

The present invention is described in further detail below with reference to the drawings and specific embodiments.

The present invention is a system in which a sound pickup array samples the sound, circuits perform synchronous, fidelity-preserving processing, a data acquisition card performs analog-to-digital conversion, and software algorithms process the signals to obtain the sound source position, as shown in Fig. 1.

The whole system is implemented mainly in hardware and software.

Hardware circuit:

The main goal is to acquire near-field, multi-channel, broadband sound signals (see Fig. 3), providing the physical basis for the model described below. The emphasis is on improving two aspects of performance: multi-channel synchronization and signal phase fidelity. In the implementation, the sound waves are picked up as signals by the sensors, passed through the fidelity circuit, acquired synchronously over multiple channels and amplified, and then passed to the host computer after analog-to-digital conversion.

The sound pickups are arranged in a linear array (see Fig. 2), which makes the system easy to move and install in different settings.

Software section:

For this algorithm, the input signal is the desktop tapping sound acquired by the multi-channel microphone array, as shown in Fig. 4.

The design goal is for the key value computed as output to match the key actually tapped when the input signal was acquired. The processing is divided into two basic steps:

1. Multi-channel acquired signal → windowing to intercept the position where the tap occurs → window checking and adjustment → band-pass filtering → computing the cross-power spectrum → dimensionality reduction to neural network input samples

2. → Creating the neural network (target output, initial weights, number of neurons, number of network layers, transfer function, learning rate, performance function) → training the network → testing the network → recognition output

The neural network is created and trained only when the system runs for the first time. Afterwards, the neural network model parameters are saved and the recognition result is computed directly from the network, which keeps the amount of computation low and allows the system to respond in real time.
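The windowing at the start of step 1 can be sketched as follows. The source does not state how the tap position is located, so the threshold-based onset detection, the function names, and the parameter values (`threshold_ratio`, `pre`, `length`) are all assumptions made for illustration. The same window is cut from every channel so that the relative delays between channels are preserved.

```python
def find_onset(signal, threshold_ratio=0.3):
    """Index of the first sample whose magnitude reaches a fraction of the peak."""
    peak = max(abs(x) for x in signal)
    if peak == 0:
        return None  # silent channel, no tap found
    for i, x in enumerate(signal):
        if abs(x) >= threshold_ratio * peak:
            return i
    return None


def cut_tap_window(channels, pre=8, length=64, threshold_ratio=0.3):
    """Cut one common window from all channels around the earliest onset."""
    onsets = [find_onset(ch, threshold_ratio) for ch in channels]
    onsets = [o for o in onsets if o is not None]
    if not onsets:
        return None
    # Start slightly before the earliest arrival so the steep leading edge is kept.
    start = max(min(onsets) - pre, 0)
    return [ch[start:start + length] for ch in channels]


# Synthetic two-channel example: the tap reaches channel 1 three samples later.
ch0 = [0.0] * 200
ch0[50] = 1.0
ch1 = [0.0] * 200
ch1[53] = 1.0
windows = cut_tap_window([ch0, ch1])
print(len(windows), len(windows[0]))
```

Because every channel is cut at the same start index, the three-sample offset between the two synthetic channels survives the windowing and remains available to the delay estimation stage.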

The first part of the software algorithm used in the present invention is the generalized cross-correlation function method (GCC, Generalized Cross-Correlation), the most widely used of the classical time delay estimation methods. The method computes the cross-power spectrum of two signals, applies a weighting in the frequency domain to suppress noise, and then transforms back to the time domain to obtain the cross-correlation function between the two signals. The position of the peak of this cross-correlation function corresponds to the relative time delay between the two signals, as shown in Fig. 5. This processing yields a preliminary delay difference, which serves as the input data for the neural network in the next stage.
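A minimal sketch of generalized cross-correlation delay estimation is given below. The source does not say which frequency-domain weighting is used; the PHAT (phase transform) weighting here is an assumption, as are the function names. The naive O(n^2) transform keeps the example free of dependencies; a real system would use an FFT library.

```python
import cmath
import math


def dft(x, inverse=False):
    """Naive discrete Fourier transform (for illustration only)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = 0j
        for t in range(n):
            acc += x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n)
        out.append(acc / n if inverse else acc)
    return out


def gcc_phat_delay(sig_a, sig_b, max_lag):
    """Number of samples by which sig_b lags sig_a (negative if it leads)."""
    n = len(sig_a) + len(sig_b)  # zero-pad to avoid circular wrap-around
    a = list(sig_a) + [0.0] * (n - len(sig_a))
    b = list(sig_b) + [0.0] * (n - len(sig_b))
    spec_a, spec_b = dft(a), dft(b)
    # Cross-power spectrum of the two signals.
    cross = [sb * sa.conjugate() for sa, sb in zip(spec_a, spec_b)]
    # PHAT weighting (assumed): discard magnitude, keep only the phase.
    weighted = [c / abs(c) if abs(c) > 1e-12 else 0j for c in cross]
    # Back to the time domain: the generalized cross-correlation function.
    corr = [v.real for v in dft(weighted, inverse=True)]
    # Non-negative lags sit at corr[lag], negative lags at corr[n + lag].
    lags = range(-max_lag, max_lag + 1)
    return max(lags, key=lambda lag: corr[lag % n])


# Synthetic tap: a decaying oscillation, and a copy delayed by 3 samples.
ref = [0.0] * 10 + [math.exp(-0.2 * t) * math.sin(0.9 * t) for t in range(30)] + [0.0] * 24
delayed = [0.0] * 3 + ref[:-3]
estimated = gcc_phat_delay(ref, delayed, max_lag=10)
print("estimated delay in samples:", estimated)
```

In a four-channel array, one such estimate per microphone pair would give the delay values that are passed on to the neural network stage.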

The second part of the software algorithm used in the present invention is a BP neural network, that is, a multilayer feedforward neural network based on the error backpropagation algorithm (BP algorithm). It was studied and designed in 1986 by D. E. Rumelhart, J. L. McClelland, and their research group. The BP algorithm is currently the most widely used neural network learning algorithm; nearly 90% of neural network applications are based on it. It is a generalization of the Widrow-Hoff algorithm to multilayer feedforward networks. Weights and thresholds are adjusted by error backpropagation along the negative gradient of the error; forward propagation of the input pattern alternates with backward propagation of the error until the network error reaches a local or global minimum.

Its error tolerance and strong adaptability are particularly effective in dealing with nonlinear errors of the system and the environment, and it has a strong ability to recognize and classify input samples. A two-layer BP network is used here. The hidden layer uses a sigmoid (S-type) transfer function, and the output layer uses a log-sigmoid transfer function to limit the output range. The input dimension is determined by the dimension of the feature vector obtained from the classical time delay estimation. The number of neurons in the output layer is determined by the number of key values to be recognized. The network is trained with the L-M optimization algorithm, a batch-mode algorithm in which the network is updated only after all inputs have been presented. It uses the Levenberg-Marquardt optimization method, which shortens the learning time. The default mse function serves as the performance function: the mean squared error between the network output and the target output t is the performance criterion. The network structure is shown in Fig. 6.
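The network described above can be illustrated with a small sketch. It keeps the two-layer structure, the log-sigmoid transfer functions, the batch-mode update, and the mean squared error performance function, but it swaps in plain gradient descent for the Levenberg-Marquardt optimizer to stay short. The class name, layer sizes, learning rate, and toy samples (three delay values per tap, three keys) are hypothetical and are not taken from the source.

```python
import math
import random


def logsig(x):
    """Log-sigmoid transfer function; output lies in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


class TwoLayerBP:
    def __init__(self, n_in, n_hidden, n_out, seed=0):
        rng = random.Random(seed)
        # Each row holds one neuron's weights plus a trailing bias term.
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(n_hidden)]
        self.w2 = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)] for _ in range(n_out)]

    def forward(self, x):
        h = [logsig(sum(w * v for w, v in zip(row, x + [1.0]))) for row in self.w1]
        y = [logsig(sum(w * v for w, v in zip(row, h + [1.0]))) for row in self.w2]
        return h, y

    def mse(self, samples):
        """Mean squared error between network output and target output."""
        total, count = 0.0, 0
        for x, t in samples:
            _, y = self.forward(x)
            total += sum((yi - ti) ** 2 for yi, ti in zip(y, t))
            count += len(t)
        return total / count

    def train_epoch(self, samples, rate=0.5):
        """Batch mode: accumulate gradients over all samples, then update once."""
        g1 = [[0.0] * len(row) for row in self.w1]
        g2 = [[0.0] * len(row) for row in self.w2]
        for x, t in samples:
            h, y = self.forward(x)
            d_out = [(yi - ti) * yi * (1 - yi) for yi, ti in zip(y, t)]
            # Propagate the output error backwards through the second layer.
            d_hid = [hj * (1 - hj) * sum(d_out[k] * self.w2[k][j] for k in range(len(d_out)))
                     for j, hj in enumerate(h)]
            for k, d in enumerate(d_out):
                for j, v in enumerate(h + [1.0]):
                    g2[k][j] += d * v
            for j, d in enumerate(d_hid):
                for i, v in enumerate(x + [1.0]):
                    g1[j][i] += d * v
        # Plain gradient descent step (assumed stand-in for Levenberg-Marquardt).
        for weights, grads in ((self.w1, g1), (self.w2, g2)):
            for row, grow in zip(weights, grads):
                for i in range(len(row)):
                    row[i] -= rate * grow[i] / len(samples)


# Hypothetical training set: delay features (in samples) and one-hot key targets.
samples = [
    ([-3.0, -1.0, 2.0], [1, 0, 0]),
    ([0.0, 0.0, 0.0], [0, 1, 0]),
    ([3.0, 1.0, -2.0], [0, 0, 1]),
]
net = TwoLayerBP(n_in=3, n_hidden=4, n_out=3)
before = net.mse(samples)
for _ in range(2000):
    net.train_epoch(samples)
after = net.mse(samples)
print(f"mse before training: {before:.4f}, after: {after:.4f}")
```

Once trained, only `forward` needs to run for each tap, which is consistent with the text's point that the saved network keeps the recognition step cheap.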

Finally, with the hardware and software designed as described above, the recognition result over 45 groups of samples is 1 failure and 44 successes, a recognition rate of about 97.8% (44/45), as shown in Fig. 7.

Claims

1. A four-channel array sound source positioning system based on a neural network, characterized in that: different positions at which a finger taps the desktop represent different key values and key positions, replacing the input mode of a conventional keyboard; the tapping sound is acquired over multiple channels, and the sound source position is obtained through generalized cross-correlation time delay estimation and neural network computation.

2. The multi-channel acquisition system according to claim 1, characterized by: synchronous acquisition of multi-channel signals, amplification, and a fidelity audio system.

3. The sound source localization algorithm combining the generalized cross-correlation function with a neural network according to claim 1, characterized in that: the sound source position is calculated accurately at low-to-medium sampling rates while resisting noise, echo reflections, and similar factors, and the recognition rate reaches 95% in repeated tests under real environmental conditions.