CN109862498A - Digital hearing aid sound source localization method based on a convolutional neural network - Google Patents
Digital hearing aid sound source localization method based on a convolutional neural network
- Publication number
- CN109862498A CN109862498A CN201910077998.8A CN201910077998A CN109862498A CN 109862498 A CN109862498 A CN 109862498A CN 201910077998 A CN201910077998 A CN 201910077998A CN 109862498 A CN109862498 A CN 109862498A
- Authority
- CN
- China
- Prior art keywords
- hearing aid
- neural networks
- convolutional neural
- voice data
- intelligent terminal
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Landscapes
- Circuit For Audible Band Transducer (AREA)
Abstract
The present invention discloses a digital hearing aid sound source localization method based on a convolutional neural network, comprising the steps of: producing training voice data, playing the voice data, and collecting it with a hearing aid and an intelligent terminal; training a neural network with the collected training voice data as input and sound source direction data as output, and loading the trained convolutional neural network into the intelligent terminal. In use, the hearing aid and the intelligent terminal establish a communication connection; after receiving external sound, the hearing aid transmits it to the intelligent terminal, and the convolutional neural network outputs sound source direction data from the voice data received by the hearing aid and the voice data received in real time by the intelligent terminal, then sends the result back to the hearing aid. The present invention can accurately obtain the direction of a sound source relative to the hearing aid user.
Description
Technical field
The present invention relates to the technical field of sound source localization, and more particularly to a digital hearing aid sound source localization method based on a convolutional neural network.
Background technique
Hearing loss is one of the most common chronic diseases today, particularly among the elderly. According to information published by the World Health Organization in 2018, 466 million people worldwide suffer from disabling hearing loss, at an annual cost of up to 750 billion dollars, and more than one third of people over 65 are affected. The impact of hearing loss on patients is enormous: for example, among the elderly with mild, moderate, and severe hearing loss, the incidence of Alzheimer's disease is respectively 2, 3, and 5 times that of elderly people with normal hearing. Hearing loss may also cause serious psychological problems such as severe insomnia, cognitive decline, and depression.
A hearing aid can help improve the hearing of a person with hearing loss and contributes greatly to recovery; the World Health Organization therefore recommends that people with disabling hearing loss wear a suitable hearing aid. Traditional analog hearing aids use a linear amplifier circuit that amplifies all input audio signals linearly. However, because hearing aid users are often insensitive to certain speech signals, this tends to produce the awkward situation of "too quiet to hear, too loud to bear". To solve this problem, digital hearing aids, which amplify according to each user's particular hearing loss, came into being. A digital hearing aid must perform a series of processing steps on the received speech signal, yet the user's living environment often contains many kinds of noise that severely degrade the hearing aid's speech processing. If a hearing aid could determine the direction of a sound source as easily as the human brain, it could perform further processing such as directional speech enhancement and greatly improve the user experience. However, existing digital hearing aids either do not use sound source localization at all, or use traditional localization techniques that are computationally complex but only moderately effective, leaving great room for improvement in speech processing.
Summary of the invention
In view of the technical drawbacks of the prior art, the object of the present invention is to provide a digital hearing aid sound source localization method based on a convolutional neural network. By constructing a more effective convolutional neural network model for hearing aid sound source localization, the method aims to solve the problems of insufficient sound source direction information and poor speech processing in existing hearing aids, thereby improving localization accuracy and the hearing aid user's experience.
The technical solution adopted to achieve the object of the present invention is as follows:
A digital hearing aid sound source localization method based on a convolutional neural network, comprising the following steps:
producing training voice data, playing the voice data, and collecting it with a hearing aid and an intelligent terminal;
training a neural network with the collected training voice data as input and sound source direction data as output, and loading the trained convolutional neural network into the intelligent terminal;
in use, establishing a communication connection between the hearing aid and the intelligent terminal; after receiving external sound, the hearing aid transmits it to the intelligent terminal, and the convolutional neural network outputs sound source direction data from the voice data received by the hearing aid and the voice data received in real time by the intelligent terminal, then sends it back to the hearing aid.
Before voice data is input into the convolutional neural network, the method further includes a preprocessing step: the voice data is first preprocessed, then speech features are extracted by the mel-frequency cepstral coefficient (MFCC) method, passing the speech signal through a mel filter bank to obtain a 24-dimensional feature signal; the frame length used for framing is adjusted so that the signal produced by the mel filter bank forms a 24 × 24 data matrix.
The left-ear amplitude information, left-ear phase information, right-ear amplitude information, and right-ear phase information from the hearing aid's binaural voice data, together with the amplitude information from the intelligent terminal voice data, five channels in total, are integrated into a 24 × 24 × 5 matrix and input to the convolutional neural network, which outputs sound source direction data.
The convolutional neural network has a seven-layer structure; from input to output it comprises a first convolutional layer, a pooling layer, a second convolutional layer, three fully connected layers, and a softmax layer. After a 24 × 24 × 5 voice data matrix is input, a 90-dimensional vector is output.
Compared with the prior art, the beneficial effects of the present invention are:
the same segment of speech received by the microphones of the hearing aid and the intelligent terminal is preprocessed and integrated, then passed to the trained convolutional neural network, which computes the direction of the sound source relative to the hearing aid user so that the hearing aid can perform further speech processing such as denoising and speech recognition.
Brief description of the drawings
Fig. 1 is a flowchart of the digital hearing aid sound source localization method based on a convolutional neural network according to the present invention.
Fig. 2 is a structural schematic diagram of the convolutional neural network used in the present invention.
Specific embodiment
The present invention is described in further detail below in conjunction with the drawings and specific embodiments. It should be understood that the specific embodiments described herein are only used to explain the present invention and are not intended to limit it.
The present invention is a digital hearing aid sound source localization method based on a convolutional neural network and assisted by an intelligent terminal: the same segment of speech received by the microphones of the hearing aid and the intelligent terminal is preprocessed and integrated, then passed to the trained convolutional neural network, which computes the direction of the sound source relative to the hearing aid user.
As shown in Fig. 1, the digital hearing aid sound source localization method based on a convolutional neural network according to the present invention includes the following steps:
producing training voice data, playing the voice data, and collecting it with a hearing aid and an intelligent terminal;
training the convolutional neural network with the voice data collected by the hearing aid and the intelligent terminal as input and sound source direction data as output, and loading the trained network into the intelligent terminal;
in use, connecting the hearing aid and the intelligent terminal by Bluetooth; while they are connected, the hearing aid sends the collected voice data to the intelligent terminal, and the convolutional neural network on the terminal takes the voice data received from the hearing aid and the voice data collected by the terminal itself as input, outputs sound source direction data, and sends it back to the hearing aid.
After the hearing aid receives the direction data, its digital signal processing (Digital Signal Processing, DSP) module performs further speech processing accordingly.
Specifically, voice data from the TIMIT corpus and the NoiseX-92 noise library can be mixed at different signal-to-noise ratios as training data; these voice data are played back and re-recorded with the hearing aid and the intelligent terminal to simulate the actual conditions of hearing aid use. The training data are then input to a computer on which a neural network program has been written in advance using the TensorFlow machine learning framework. After the re-recorded voice data are input, the network is trained on the training data according to the standard training procedure for convolutional neural networks. When training reaches the set number of iterations, the trained convolutional neural network is saved and imported into the hearing aid's companion application (app) on the terminal for use.
In use, the hearing aid and the intelligent terminal are first connected by Bluetooth. While they are connected, if the voice activity detection (Voice Activity Detection, VAD) module in the hearing aid detects voice data, the voice data collected by the left and right microphones are sent to the terminal. Using the voice data it receives itself together with the data sent by the hearing aid, the terminal determines the direction of the sound source with the convolutional neural network localization algorithm of the present invention and sends the direction data back to the hearing aid, whose digital signal processing (Digital Signal Processing, DSP) module then performs further speech processing.
The present invention uses the microphones of the hearing aid and the intelligent terminal together as the speech signal source. When a speech signal is detected, three segments of voice data from separate sources, the left-ear hearing aid, the right-ear hearing aid, and the terminal microphone, are collected. Each segment is first preprocessed with operations such as windowing, framing, and the short-time Fourier transform. Speech features are then extracted by the mel-frequency cepstral coefficient method: the speech signal is passed through a mel filter bank and converted into a 24-dimensional feature signal. The frame length used for framing is adjusted so that the mel filter bank produces exactly a 24 × 24 data matrix, convenient for the convolutional neural network to process. Since the relative positions of the left- and right-ear hearing aids are fixed while the position of the intelligent terminal relative to the hearing aids can vary, the present invention keeps both amplitude and phase information of the left- and right-ear hearing aid signals as the main signals, and keeps only the amplitude information of the terminal microphone signal as an auxiliary signal. The five channels, left-ear amplitude, left-ear phase, right-ear amplitude, right-ear phase, and terminal amplitude, are integrated into a 24 × 24 × 5 matrix and passed to the convolutional neural network, which finally yields the direction of the sound source.
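The exact mel filter bank and phase-feature computation are not fully specified in the patent. The following sketch (an overlapping rectangular filter bank stands in for proper mel triangles, and a random signal stands in for a recording) shows how a 24 × 24 feature map per channel could be produced and the five channels stacked into the 24 × 24 × 5 network input:

```python
import numpy as np

def frame_signal(x, n_frames=24, frame_len=512):
    """Split x into exactly n_frames overlapping frames, mirroring the
    patent's idea of tuning the framing so that 24 frames come out."""
    hop = (len(x) - frame_len) // (n_frames - 1)
    return np.stack([x[i * hop:i * hop + frame_len] for i in range(n_frames)])

def feature_map(x, n_bands=24, frame_len=512):
    """24 frames x 24 bands of magnitude features.  Overlapping rectangular
    bands stand in here for the mel filter bank described in the text."""
    frames = frame_signal(x, 24, frame_len) * np.hanning(frame_len)
    mag = np.abs(np.fft.rfft(frames, axis=1))        # (24, frame_len//2 + 1)
    edges = np.linspace(0, mag.shape[1], n_bands + 2).astype(int)
    return np.stack(
        [mag[:, edges[i]:edges[i + 2]].mean(axis=1) for i in range(n_bands)],
        axis=1,
    )                                                 # (24, 24)

rng = np.random.default_rng(1)
sig = rng.standard_normal(16000)                      # stand-in recording
left_amp = feature_map(sig)                           # one of the five channels
# left_phase, right_amp, right_phase, and term_amp would be built similarly;
# identical copies are stacked here only to demonstrate the final input shape.
cnn_input = np.stack([left_amp] * 5, axis=-1)
```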
The intelligent terminal can be a mobile phone, a tablet computer, or another portable intelligent terminal device.
The convolutional neural network used in the present invention has a seven-layer structure; from input to output it comprises convolutional layer 1, a pooling layer, convolutional layer 2, three fully connected layers, and a softmax layer. After a 24 × 24 × 5 voice data matrix is input, a 90-dimensional vector is output.
The concrete operation of each layer of the convolutional neural network used in the present invention is as follows:
Step 1: convolutional layer 1. The filter size of this convolutional layer is 5 × 5, the depth is 5, and the stride is 1. The output matrix size follows the formula
out_l = (in_l - filter_l) / step + 1, out_w = (in_w - filter_w) / step + 1,
where out, in, filter, and step denote this layer's output, input, filter, and stride respectively, and the subscripts l and w denote the length and width of each attribute. By this formula, inputting the 24 × 24 × 5 initial data matrix into this layer yields a 20 × 20 × 5 output. For each 1 × 1 × 5 unit node matrix, let w(i)_{x,y,z} and a_{x,y,z} denote, for the i-th node of the unit node matrix, the weight and input value at the corresponding input node (x, y, z), and let b_i denote the bias of the i-th node. Then the value of the i-th node in the unit node matrix is
H(i) = f( Σ_{x,y,z} w(i)_{x,y,z} · a_{x,y,z} + b_i ),
where
f(x) = ReLU(x) = max(0, x)
is the rectified linear unit (Rectified Linear Unit, ReLU), a common activation function in neural networks.
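The output-size formula above can be checked in a few lines of Python (the helper name `conv_out` is illustrative, not from the patent):

```python
def conv_out(in_size, filter_size, stride):
    """Valid-padding output size: (in - filter) / stride + 1."""
    assert (in_size - filter_size) % stride == 0
    return (in_size - filter_size) // stride + 1

assert conv_out(24, 5, 1) == 20   # convolutional layer 1: 24x24x5 -> 20x20x5
assert conv_out(20, 2, 2) == 10   # 2x2 max pooling, stride 2:   -> 10x10x5
assert conv_out(10, 5, 1) == 6    # convolutional layer 2:       -> 6x6x16
```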
Step 2: pooling layer. This is a max pooling layer with stride 2: the input matrix is split into 2 × 2 submatrices, the maximum of the four elements of each submatrix is retained, and a new matrix is formed. The same split-and-pool operation is applied at every depth, so inputting the 20 × 20 × 5 matrix yields a 10 × 10 × 5 output matrix.
Step 3: convolutional layer 2. The filter size is 5 × 5, the depth is 16, and the stride is 1. Using a method and formulas similar to those of step 1, with the node index now i = 1, 2, ..., 16, this layer yields a 6 × 6 × 16 output matrix.
Step 4: fully connected layer 1. After this layer, a 1 × 1 × 360 output matrix is obtained, i.e. the number of output nodes is 360, where the value of the i-th output node is
g(i) = f( Σ_x w(i)_x · a_x + b_i ).
Step 5: fully connected layer 2. This layer has 360 input nodes and 180 output nodes. Let w(i)_x and a_x denote the weight and input value for the i-th output node; then its value is
g(i) = f( Σ_x w(i)_x · a_x + b_i ).
Step 6: fully connected layer 3. This layer has 180 input nodes and 90 output nodes, computed in the same way as step 5. A softmax (normalized exponential function) layer follows this output layer, converting the outputs g(i) into a probability distribution o(i):
o(i) = e^{g(i)} / Σ_j e^{g(j)}.
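Since the embodiment mentions TensorFlow, the seven-layer structure of steps 1 to 6 could be sketched with the Keras API roughly as follows (layer hyperparameters follow the text; everything else, such as the optimizer and training loop, is omitted):

```python
import numpy as np
import tensorflow as tf

# Seven-layer network: conv1 -> max pool -> conv2 -> three fully
# connected layers -> softmax, as described in steps 1-6 above.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(24, 24, 5)),                     # 24x24x5 input
    tf.keras.layers.Conv2D(5, (5, 5), activation="relu"),  # -> 20x20x5
    tf.keras.layers.MaxPooling2D((2, 2), strides=2),       # -> 10x10x5
    tf.keras.layers.Conv2D(16, (5, 5), activation="relu"), # -> 6x6x16
    tf.keras.layers.Flatten(),                             # -> 576 nodes
    tf.keras.layers.Dense(360, activation="relu"),
    tf.keras.layers.Dense(180, activation="relu"),
    tf.keras.layers.Dense(90, activation="softmax"),       # 90 direction bins
])

probs = model(np.zeros((1, 24, 24, 5), dtype="float32"))
```

The softmax output sums to 1, matching the probability-distribution interpretation below.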
Through the operation of the convolutional neural network, a 90-dimensional vector satisfying a probability distribution is finally obtained. The full circle of directions is divided into 90 parts at 4° intervals, and the 90 dimensions correspond to the 90 directions starting from straight ahead and proceeding clockwise. When the values of certain dimensions of the vector are significantly larger than the others, the sound source is considered to lie in that direction.
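Reading off the direction from the 90-dimensional output can then be as simple as taking the largest bin. The exact bin-to-angle convention below (bin k meaning k × 4° clockwise from straight ahead) is an assumption consistent with the description:

```python
import numpy as np

def decode_direction(probs, bin_deg=4.0):
    """Return the angle, in degrees clockwise from straight ahead,
    of the most probable of the 90 direction bins."""
    assert probs.shape == (90,)
    return int(np.argmax(probs)) * bin_deg

probs = np.zeros(90)
probs[22] = 1.0                  # all probability mass in bin 22
angle = decode_direction(probs)  # 22 * 4 degrees
```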
In the present invention, when the hearing aid's VAD module detects a speech signal, the left- and right-ear hearing aids and the intelligent terminal microphone simultaneously receive the speech signal and preprocess it, then pass it to the trained convolutional neural network, thereby obtaining the direction of the sound source relative to the hearing aid user.
The present invention applies a convolutional neural network to the hearing aid sound source localization algorithm. Because a convolutional neural network can extract features directly by convolving over speech frequency bins, the present invention can greatly improve localization accuracy and, through further processing such as directional speech enhancement and recognition, greatly improve the hearing aid user's experience.
The above is only a preferred embodiment of the present invention. It should be noted that those of ordinary skill in the art may make various improvements and modifications without departing from the principle of the present invention, and these improvements and modifications should also be regarded as falling within the protection scope of the present invention.
Claims (3)
1. A digital hearing aid sound source localization method based on a convolutional neural network, characterized by comprising the following steps:
producing training voice data, playing the voice data, and collecting it with a hearing aid and an intelligent terminal;
training a neural network with the collected training voice data as input and sound source direction data as output, and loading the trained convolutional neural network into the intelligent terminal;
in use, establishing a communication connection between the hearing aid and the intelligent terminal; after receiving external sound, the hearing aid transmits it to the intelligent terminal, and the convolutional neural network outputs sound source direction data from the voice data received by the hearing aid and the voice data received in real time by the intelligent terminal, then sends it back to the hearing aid.
2. The digital hearing aid sound source localization method based on a convolutional neural network according to claim 1, characterized in that, before voice data is input into the convolutional neural network, the method further includes a preprocessing step: the voice data is first preprocessed, then speech features are extracted by the mel-frequency cepstral coefficient method, passing the speech signal through a mel filter bank to obtain a 24-dimensional feature signal; the frame length used for framing is adjusted so that the signal produced by the mel filter bank forms a 24 × 24 data matrix;
the left-ear amplitude information, left-ear phase information, right-ear amplitude information, and right-ear phase information from the hearing aid's binaural voice data, together with the amplitude information from the intelligent terminal voice data, five channels in total, are integrated into a 24 × 24 × 5 matrix and input to the convolutional neural network, which outputs sound source direction data.
3. The digital hearing aid sound source localization method based on a convolutional neural network according to claim 2, characterized in that the convolutional neural network has a seven-layer structure, comprising from input to output a first convolutional layer, a pooling layer, a second convolutional layer, three fully connected layers, and a softmax layer, and outputs a 90-dimensional vector after a 24 × 24 × 5 voice data matrix is input.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910077998.8A CN109862498A (en) | 2019-01-28 | 2019-01-28 | A kind of digital deaf-aid sound source direction method based on convolutional neural networks |
Publications (1)
Publication Number | Publication Date |
---|---|
CN109862498A true CN109862498A (en) | 2019-06-07 |
Family
ID=66896296
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109862498A (en) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106331973A (en) * | 2016-10-20 | 2017-01-11 | 天津大学 | Realization method of hearing-aid filter bank based on portable terminal |
WO2017191249A1 (en) * | 2016-05-06 | 2017-11-09 | Robert Bosch Gmbh | Speech enhancement and audio event detection for an environment with non-stationary noise |
CN108024188A (en) * | 2017-09-30 | 2018-05-11 | 天津大学 | A kind of high intelligibility voice de-noising algorithm based on intelligent terminal |
CN108122559A (en) * | 2017-12-21 | 2018-06-05 | 北京工业大学 | Binaural sound sources localization method based on deep learning in a kind of digital deaf-aid |
CN108717178A (en) * | 2018-04-12 | 2018-10-30 | 福州瑞芯微电子股份有限公司 | A kind of sound localization method and device based on neural network |
Non-Patent Citations (1)
Title |
---|
Tan Yawen, Wang Lijie, Yao Xinyu, Tang Yibin, Zhou Lin: "Binaural sound source localization algorithm based on BP neural network", Audio Engineering (《电声技术》) * |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113924786A (en) * | 2019-06-09 | 2022-01-11 | 根特大学 | Neural network model for cochlear mechanics and processing |
US11800301B2 (en) | 2019-06-09 | 2023-10-24 | Universiteit Gent | Neural network model for cochlear mechanics and processing |
CN113924786B (en) * | 2019-06-09 | 2024-03-29 | 根特大学 | Neural network model for cochlear mechanics and processing |
CN112201259A (en) * | 2020-09-23 | 2021-01-08 | 北京百度网讯科技有限公司 | Sound source positioning method, device, equipment and computer storage medium |
CN112201259B (en) * | 2020-09-23 | 2022-11-25 | 北京百度网讯科技有限公司 | Sound source positioning method, device, equipment and computer storage medium |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Vecchiotti et al. | End-to-end binaural sound localisation from the raw waveform | |
Qian et al. | Very deep convolutional neural networks for noise robust speech recognition | |
CN110517705B (en) | Binaural sound source positioning method and system based on deep neural network and convolutional neural network | |
CN111833896B (en) | Voice enhancement method, system, device and storage medium for fusing feedback signals | |
CN110728989B | Binaural speech separation method based on long short-term memory network (LSTM) | |
CN109215665A (en) | A kind of method for recognizing sound-groove based on 3D convolutional neural networks | |
CN105575403A (en) | Cross-correlation sound source positioning method with combination of auditory masking and double-ear signal frames | |
CN107507625A (en) | Sound source distance determines method and device | |
Li et al. | Sams-net: A sliced attention-based neural network for music source separation | |
CN109862498A (en) | A kind of digital deaf-aid sound source direction method based on convolutional neural networks | |
CN112885375A (en) | Global signal-to-noise ratio estimation method based on auditory filter bank and convolutional neural network | |
CN110501673A (en) | A kind of binaural sound source direction in space estimation method and system based on multitask time-frequency convolutional neural networks | |
CN109831732A (en) | Intelligent chauvent's criterion device and method based on smart phone | |
CN109448702A (en) | Artificial cochlea's auditory scene recognition methods | |
Lin et al. | Bionic optimization of MFCC features based on speaker fast recognition | |
WO2020062679A1 (en) | End-to-end speaker diarization method and system employing deep learning | |
CN112397090B (en) | Real-time sound classification method and system based on FPGA | |
CN105609099A (en) | Speech recognition pretreatment method based on human auditory characteristic | |
Krecichwost et al. | Automated detection of sigmatism using deep learning applied to multichannel speech signal | |
CN106128480B (en) | The method that a kind of pair of noisy speech carries out voice activity detection | |
CN112908353A (en) | Voice enhancement method for hearing aid by combining edge computing and cloud computing | |
CN113327589B (en) | Voice activity detection method based on attitude sensor | |
CN115472168A (en) | Short-time voice voiceprint recognition method, system and equipment coupling BGCC and PWPE characteristics | |
CN114550701A (en) | Deep neural network-based Chinese electronic larynx voice conversion device and method | |
Elmahdy et al. | Subvocal speech recognition via close-talk microphone and surface electromyogram using deep learning |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| WD01 | Invention patent application deemed withdrawn after publication | Application publication date: 20190607 |