CN109862498A - Digital hearing aid sound source localization method based on a convolutional neural network - Google Patents

Digital hearing aid sound source localization method based on a convolutional neural network

Info

Publication number
CN109862498A
CN109862498A (application CN201910077998.8A)
Authority
CN
China
Prior art keywords
hearing aid
convolutional neural network
voice data
intelligent terminal
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910077998.8A
Other languages
Chinese (zh)
Inventor
陈霏
张雨晨
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tianjin University
Original Assignee
Tianjin University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tianjin University
Priority to CN201910077998.8A
Publication of CN109862498A
Legal status: Pending

Landscapes

  • Circuit For Audible Band Transducer (AREA)

Abstract

The present invention discloses a sound source localization method for digital hearing aids based on a convolutional neural network, comprising the steps of: preparing training voice data, playing the voice data, and recording it with a hearing aid and an intelligent terminal; training a neural network with the voice data collected by the hearing aid and the intelligent terminal as input and sound source direction data as output, and loading the trained convolutional neural network into the intelligent terminal; in use, establishing a communication connection between the hearing aid and the intelligent terminal, the hearing aid transmitting received external sound to the intelligent terminal, and the convolutional neural network computing sound source direction data from the voice data received by the hearing aid and the voice data received in real time by the intelligent terminal and sending it back to the hearing aid. The present invention can accurately determine the direction of a sound source relative to the hearing aid user.

Description

Digital hearing aid sound source localization method based on a convolutional neural network
Technical field
The present invention relates to the technical field of sound source localization, and more particularly to a sound source localization method for digital hearing aids based on a convolutional neural network.
Background technique
Hearing loss is one of the most common chronic diseases today, particularly among the elderly. According to information released by the World Health Organization in 2018, 466 million people worldwide live with disabling hearing loss, at an annual cost of up to 750 billion dollars, and more than one third of people over 65 suffer from disabling hearing loss. The impact of hearing loss on patients is enormous: for example, among the elderly with mild, moderate, and severe hearing loss, the incidence of Alzheimer's disease is respectively 2, 3, and 5 times that of the elderly with normal hearing. Hearing loss may also cause severe psychological problems such as serious insomnia, cognitive decline, and depression.
A hearing aid can help improve the hearing of a person with hearing loss and is of great benefit to hearing recovery; the World Health Organization therefore recommends that people with disabling hearing loss wear a suitable hearing aid. Traditional analog hearing aids use a linear amplifier circuit that applies linear amplification to all input audio signals. However, because hearing aid users are often insensitive to certain speech signals, this frequently produces the awkward situation of "quiet sounds cannot be heard, loud sounds are unpleasant." To solve this problem, digital hearing aids that amplify according to the user's individual hearing loss came into being. A digital hearing aid needs to perform a series of processing steps on the received speech signal, yet the living environment of a hearing aid user usually contains various kinds of noise that severely degrade the hearing aid's speech processing. If a hearing aid could determine the direction of a sound source as easily as the human brain, it could perform further processing such as directional speech enhancement and greatly improve the user experience. However, existing digital hearing aids either do not use sound source localization at all, or use traditional localization techniques that are computationally complex yet only moderately effective, leaving much room for improvement in speech processing.
Summary of the invention
In view of the drawbacks of the prior art, it is an object of the present invention to provide a sound source localization method for digital hearing aids based on a convolutional neural network. By means of a convolutional neural network model, a more effective hearing aid sound source localization method is constructed, with the aim of solving problems such as the lack of sound source direction information and the poor speech processing performance of existing hearing aids, thereby improving both the localization accuracy of the hearing aid and the user experience.
The technical solution adopted to achieve the object of the present invention is as follows:
A sound source localization method for digital hearing aids based on a convolutional neural network, comprising the following steps:
preparing voice data for training, playing the voice data, and recording it with a hearing aid and an intelligent terminal;
training a neural network with the training voice data collected by the hearing aid and the intelligent terminal as input and sound source direction data as output, and loading the trained convolutional neural network into the intelligent terminal;
in use, establishing a communication connection between the hearing aid and the intelligent terminal; the hearing aid transmits received external sound to the intelligent terminal, and the convolutional neural network outputs sound source direction data computed from the voice data received by the hearing aid and the voice data received in real time by the intelligent terminal, and sends it back to the hearing aid.
Before voice data is input into the convolutional neural network, a preprocessing step is performed: the voice data is first preprocessed, then speech features are extracted with the mel cepstral coefficient method, and the speech signal is passed through a mel filter bank and converted into 24-dimensional feature vectors; by adjusting the frame length during framing, the mel filter bank output for each recording forms a 24 × 24 data matrix.
The left-ear amplitude, left-ear phase, right-ear amplitude, and right-ear phase from the hearing aid's binaural voice data, together with the amplitude from the intelligent terminal's voice data, are integrated as five channels into a 24 × 24 × 5 matrix, which is fed into the convolutional neural network to obtain the sound source direction data.
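As an illustration, the five-channel stacking described above can be sketched with NumPy. The shapes and variable names are illustrative assumptions, and random data stands in for the real time-frequency representations; this is not the patent's implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-ins for the 24 x 24 complex time-frequency representations of the
# left-ear, right-ear, and terminal recordings (illustrative random data).
left = rng.standard_normal((24, 24)) + 1j * rng.standard_normal((24, 24))
right = rng.standard_normal((24, 24)) + 1j * rng.standard_normal((24, 24))
terminal = rng.standard_normal((24, 24)) + 1j * rng.standard_normal((24, 24))

# Main signals keep amplitude and phase; the auxiliary terminal signal
# keeps amplitude only, giving the 24 x 24 x 5 network input.
cube = np.stack(
    [np.abs(left), np.angle(left),
     np.abs(right), np.angle(right),
     np.abs(terminal)],
    axis=-1,
)
assert cube.shape == (24, 24, 5)
```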
The convolutional neural network has a seven-layer structure, consisting, from input to output, of a first convolutional layer, a pooling layer, a second convolutional layer, three fully connected layers, and a softmax layer; feeding in a 24 × 24 × 5 voice data matrix yields a 90-dimensional output vector.
Compared with the prior art, the beneficial effects of the present invention are:
The present invention preprocesses and integrates the same segment of speech captured by the microphones of the hearing aid and the intelligent terminal, then passes it to a trained convolutional neural network for computation, obtaining the direction of the sound source relative to the hearing aid user so that the hearing aid can perform further speech processing such as denoising and speech recognition.
Detailed description of the invention
Fig. 1 is a flow chart of the sound source localization method for digital hearing aids based on a convolutional neural network according to the present invention.
Fig. 2 is a structural schematic diagram of the convolutional neural network used in the present invention.
Specific embodiment
The present invention is described in further detail below with reference to the drawings and specific embodiments. It should be understood that the specific embodiments described herein are only intended to explain the present invention and not to limit it.
The present invention is a digital hearing aid sound source localization method based on a convolutional neural network and assisted by an intelligent terminal: the same segment of speech captured by the microphones of the hearing aid and the intelligent terminal is preprocessed and integrated, then handed to a trained convolutional neural network for computation, yielding the direction of the sound source relative to the hearing aid user.
As shown in Fig. 1, the sound source localization method for digital hearing aids based on a convolutional neural network according to the present invention comprises the following steps:
preparing voice data for training, playing the voice data, and recording it with a hearing aid and an intelligent terminal;
training the convolutional neural network with the voice data collected by the hearing aid and the intelligent terminal as input and sound source direction data as output, and loading the trained convolutional neural network into the intelligent terminal;
in use, connecting the hearing aid and the intelligent terminal via Bluetooth; while they are connected, the hearing aid sends the voice data it collects to the intelligent terminal, and the convolutional neural network on the intelligent terminal takes the voice data received from the hearing aid together with the voice data captured by the terminal itself as input, outputs the sound source direction data, and sends it back to the hearing aid.
After the hearing aid receives the direction data, its Digital Signal Processing (DSP) module performs further speech processing accordingly.
Specifically, voice data from the TIMIT corpus and the NoiseX-92 noise library can be mixed at various signal-to-noise ratios to serve as training data. These recordings are played back and re-recorded with the hearing aid and the intelligent terminal to simulate real conditions of hearing aid use, and the re-recorded data are then fed into a computer on which a neural network program has been written in advance using the TensorFlow machine learning framework. The network is trained on this data according to the usual convolutional neural network training procedure. Once training reaches the set number of iterations, the trained convolutional neural network is saved and imported into the hearing aid's companion terminal application (app) for use.
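The signal-to-noise-ratio mixing step can be sketched as follows. This is a minimal illustration: the function name and sampling details are assumptions, and random signals stand in for the TIMIT and NoiseX-92 recordings the patent actually uses:

```python
import numpy as np

def mix_at_snr(speech, noise, snr_db):
    """Scale `noise` so the speech-to-noise power ratio equals `snr_db`
    decibels, then return the mixture."""
    noise = noise[: len(speech)]
    p_speech = np.mean(speech ** 2)
    p_noise = np.mean(noise ** 2)
    scale = np.sqrt(p_speech / (p_noise * 10 ** (snr_db / 10.0)))
    return speech + scale * noise

rng = np.random.default_rng(1)
speech = rng.standard_normal(16000)   # stand-in for a TIMIT utterance
noise = rng.standard_normal(16000)    # stand-in for NoiseX-92 noise
mixture = mix_at_snr(speech, noise, snr_db=5.0)
```

In practice one would loop over several SNR values to build a training set covering the noise conditions the hearing aid is expected to face.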
In use, the hearing aid and the intelligent terminal are first connected via Bluetooth. While they are connected, if the Voice Activity Detection (VAD) module in the hearing aid detects voice data, the voice data collected by the left and right microphones is sent to the terminal. Using the voice data it receives itself together with the data sent by the hearing aid, the terminal determines the direction of the sound source with the convolutional neural network localization algorithm of the present invention and sends the direction data back to the hearing aid, so that the hearing aid's Digital Signal Processing (DSP) module can perform further speech processing.
The present invention uses the microphones of the hearing aid and the intelligent terminal together as the voice signal sources. When a voice signal is detected, voice data from three separate sources (left-ear hearing aid, right-ear hearing aid, and terminal microphone) is collected. Each segment is first preprocessed with operations such as windowing, framing, and the short-time Fourier transform. Speech features are then extracted with the mel cepstral coefficient method: the speech signal is passed through a mel filter bank and converted into 24-dimensional feature vectors. By adjusting the frame length during framing, the mel filter bank output becomes exactly a 24 × 24 data matrix, convenient for the convolutional neural network to process. Since the relative positions of the left- and right-ear hearing aids are fixed, while the position of the intelligent terminal relative to the hearing aid can vary, the signals from the left- and right-ear hearing aids are treated as the main signals, retaining both amplitude and phase information, and the signal from the terminal microphone is treated as an auxiliary signal, retaining amplitude information only. The five channels (left-ear amplitude, left-ear phase, right-ear amplitude, right-ear phase, and terminal amplitude) are integrated into a 24 × 24 × 5 matrix and handed to the convolutional neural network, which finally produces the direction value of the sound source.
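A minimal sketch of producing one 24 × 24 feature matrix from a recording is given below. The concrete values (16 kHz sampling rate, 1 s of audio, 1024-point frames, a hop length chosen so that exactly 24 frames result) are assumptions, and the filter bank follows the standard triangular mel recipe rather than any detail stated in the patent:

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_mels, n_fft, sr):
    """Triangular mel filters mapping n_fft//2+1 FFT bins to n_mels bands."""
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sr).astype(int)
    fb = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(1, n_mels + 1):
        lo, center, hi = bins[i - 1], bins[i], bins[i + 1]
        for k in range(lo, center):        # rising edge of the triangle
            fb[i - 1, k] = (k - lo) / max(center - lo, 1)
        for k in range(center, hi):        # falling edge of the triangle
            fb[i - 1, k] = (hi - k) / max(hi - center, 1)
    return fb

sr, n_fft = 16000, 1024
signal = np.random.default_rng(2).standard_normal(sr)  # 1 s stand-in signal
hop = (len(signal) - n_fft) // 23                      # -> exactly 24 frames
frames = np.stack([signal[i * hop: i * hop + n_fft] for i in range(24)])
spectra = np.abs(np.fft.rfft(frames * np.hanning(n_fft), axis=1))
feats = np.log(spectra @ mel_filterbank(24, n_fft, sr).T + 1e-10)
assert feats.shape == (24, 24)   # 24 frames x 24 mel bands
```

The key point mirrored from the description is the hop-length choice: the frame step is adjusted so that the recording yields exactly 24 frames, making the filter bank output a square 24 × 24 matrix.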
The intelligent terminal may be a mobile phone, a tablet computer, or another portable intelligent terminal device.
The convolutional neural network used in the present invention has a seven-layer structure, consisting, from input to output, of convolutional layer 1, a pooling layer, convolutional layer 2, three fully connected layers, and a softmax layer; feeding in a 24 × 24 × 5 voice data matrix yields a 90-dimensional output vector.
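The layer sequence can be traced with a toy NumPy forward pass using random weights. This is only a shape-level sketch of the described architecture, not the trained network (in practice the patent builds and trains the network with TensorFlow), and the weight scales are arbitrary assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
relu = lambda x: np.maximum(0, x)

def conv2d(x, w, b):
    """Valid convolution, stride 1. x: (H, W, C); w: (k, k, C, F); b: (F,)."""
    H, W, C = x.shape
    k, _, _, F = w.shape
    out = np.zeros((H - k + 1, W - k + 1, F))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            # each output node: sum over the k x k x C patch of w * a, plus bias
            out[i, j] = np.tensordot(x[i:i + k, j:j + k], w, axes=3) + b
    return relu(out)

def maxpool2(x):
    """2x2 max pooling, stride 2. x: (H, W, C) with even H, W."""
    H, W, C = x.shape
    return x.reshape(H // 2, 2, W // 2, 2, C).max(axis=(1, 3))

def dense(x, w, b, act=relu):
    return act(w @ x + b)

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

x = rng.standard_normal((24, 24, 5))                                   # input cube
h = conv2d(x, rng.standard_normal((5, 5, 5, 5)) * 0.1, np.zeros(5))    # -> 20x20x5
h = maxpool2(h)                                                        # -> 10x10x5
h = conv2d(h, rng.standard_normal((5, 5, 5, 16)) * 0.1, np.zeros(16))  # -> 6x6x16
h = h.ravel()                                                          # -> 576
h = dense(h, rng.standard_normal((360, 576)) * 0.05, np.zeros(360))    # -> 360
h = dense(h, rng.standard_normal((180, 360)) * 0.05, np.zeros(180))    # -> 180
o = dense(h, rng.standard_normal((90, 180)) * 0.05, np.zeros(90), act=softmax)
assert o.shape == (90,)   # 90-dimensional probability vector
```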
The concrete operation of each layer of the convolutional neural network used in the present invention is as follows:
Step 1: convolutional layer 1. The filter size is 5 × 5, the depth is 5, and the stride is 1. The output size of the layer is given by the output matrix formula

out_l = (in_l - filter_l) / step + 1,  out_w = (in_w - filter_w) / step + 1

where out, in, filter, and step denote this layer's output, input, filter, and stride, and the subscripts l and w denote the length and width of each quantity. By this formula, the 24 × 24 × 5 initial input matrix yields a 20 × 20 × 5 output. For each 1 × 1 × 5 unit node matrix, let w^i_{x,y,z} and a_{x,y,z} denote, for the i-th node of the unit node matrix, the weight and the input value of the corresponding input node (x, y, z), and let b_i denote the bias of the i-th node. The value of the i-th node of the unit is then

H(i) = f( Σ_{x,y,z} w^i_{x,y,z} · a_{x,y,z} + b_i )
f(x) = ReLU(x) = max(0, x)
where f(x) is the rectified linear unit (ReLU), an activation function commonly used in neural networks.
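As a worked check of the two formulas above (output size and node value), under the stated 24 × 24 × 5 input and 5 × 5 filter with stride 1; the random patch and weights are illustrative:

```python
import numpy as np

def conv_out(size_in, size_filter, step=1):
    # out = (in - filter) / step + 1
    return (size_in - size_filter) // step + 1

# 24 x 24 input, 5 x 5 filter, stride 1 -> 20 x 20 output
assert conv_out(24, 5) == 20

# One node of the unit node matrix: H(i) = ReLU(sum(w * a) + b)
rng = np.random.default_rng(3)
a = rng.standard_normal((5, 5, 5))   # 5 x 5 x 5 input patch
w = rng.standard_normal((5, 5, 5))   # weights of the i-th node
b = 0.1                              # bias of the i-th node
h_i = max(0.0, float(np.sum(w * a) + b))
assert h_i >= 0.0
```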
Step 2: pooling layer. This is a max pooling layer with stride 2: the input matrix is split into 2 × 2 submatrices, the maximum of the four elements of each submatrix is retained, and the results form a new matrix. The same split-and-merge operation is applied to each depth channel, so after the 20 × 20 × 5 input passes through this layer, a 10 × 10 × 5 output matrix is obtained.
Step 3: convolutional layer 2. The filter size is 5 × 5, the depth is 16, and the stride is 1. Applying the same method and formulas as in step 1, with the node index now running over i = 1, 2, ..., 16, this layer produces a 6 × 6 × 16 output matrix.
Step 4: fully connected layer 1. This layer produces a 1 × 1 × 360 output matrix, i.e. 360 output nodes, from the flattened 6 × 6 × 16 input. The value of the i-th output node is

g(i) = f( Σ_x w^i_x · a_x + b_i )
Step 5: fully connected layer 2. This layer has 360 input nodes and 180 output nodes. With w^i_x and a_x denoting the weight and input value for the i-th output node, the value of the i-th output node is

g(i) = f( Σ_{x=1}^{360} w^i_x · a_x + b_i )
Step 6: fully connected layer 3. This layer has 180 input nodes and 90 output nodes, computed as in step 5. A softmax (normalized exponential) layer is then applied after this output layer, converting the outputs g(i) into a probability distribution o(i):

o(i) = e^{g(i)} / Σ_{j=1}^{90} e^{g(j)}
Through the operation of the convolutional neural network, a 90-dimensional vector obeying a probability distribution is finally obtained. The full circle of directions is divided into 90 sectors of 4° each, and the 90 dimensions of the vector correspond to the 90 directions starting from straight ahead and proceeding clockwise. When the value of one dimension of the vector is significantly larger than the other dimensions, the sound source is considered to lie in that direction.
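Decoding a direction from the 90-dimensional output can be sketched as follows. The 4° sector convention follows the description above; the helper name and the choice of returning the sector start angle are illustrative assumptions:

```python
import numpy as np

def decode_direction(prob):
    """Map a 90-dim probability vector to a clockwise angle from front.
    Sector i covers [4*i, 4*i + 4) degrees; the sector start is returned."""
    assert prob.shape == (90,)
    return 4 * int(np.argmax(prob))

# Example: probability mass concentrated in sector 22 -> 88 degrees
p = np.full(90, 0.001)
p[22] = 1.0 - 0.001 * 89
assert decode_direction(p) == 88
```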
In the present invention, when the hearing aid's VAD module detects a voice signal, the left- and right-ear hearing aids and the intelligent terminal microphone simultaneously receive the voice signal and preprocess it, then hand it to the trained convolutional neural network for computation, thereby obtaining the direction of the sound source relative to the hearing aid user.
The present invention applies convolutional neural networks to hearing aid sound source localization. Because a convolutional neural network has the advantage of extracting features directly from the speech frequency bins by convolution, the present invention can greatly improve the accuracy of hearing aid sound source localization, enabling further processing such as directional speech enhancement and recognition and greatly improving the user experience.
The above is only a preferred embodiment of the present invention. It should be noted that those of ordinary skill in the art may make various improvements and modifications without departing from the principle of the present invention, and such improvements and modifications shall also be regarded as falling within the protection scope of the present invention.

Claims (3)

1. A sound source localization method for digital hearing aids based on a convolutional neural network, characterized by comprising the following steps:
preparing voice data for training, playing the voice data, and recording it with a hearing aid and an intelligent terminal;
training a neural network with the training voice data collected by the hearing aid and the intelligent terminal as input and sound source direction data as output, and loading the trained convolutional neural network into the intelligent terminal;
in use, establishing a communication connection between the hearing aid and the intelligent terminal; the hearing aid transmits received external sound to the intelligent terminal, and the convolutional neural network outputs sound source direction data computed from the voice data received by the hearing aid and the voice data received in real time by the intelligent terminal, and sends it back to the hearing aid.
2. The sound source localization method for digital hearing aids based on a convolutional neural network according to claim 1, characterized in that, before voice data is input into the convolutional neural network, a preprocessing step is performed: the voice data is first preprocessed, then speech features are extracted with the mel cepstral coefficient method, and the speech signal is passed through a mel filter bank and converted into 24-dimensional feature vectors; by adjusting the frame length during framing, the mel filter bank output forms a 24 × 24 data matrix;
the left-ear amplitude, left-ear phase, right-ear amplitude, and right-ear phase from the hearing aid's binaural voice data, together with the amplitude from the intelligent terminal's voice data, are integrated as five channels into a 24 × 24 × 5 matrix, which is fed into the convolutional neural network to obtain the sound source direction data.
3. The sound source localization method for digital hearing aids based on a convolutional neural network according to claim 2, characterized in that the convolutional neural network has a seven-layer structure, consisting, from input to output, of a first convolutional layer, a pooling layer, a second convolutional layer, three fully connected layers, and a softmax layer; feeding in a 24 × 24 × 5 voice data matrix yields a 90-dimensional output vector.
CN201910077998.8A 2019-01-28 2019-01-28 Digital hearing aid sound source localization method based on a convolutional neural network Pending CN109862498A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910077998.8A CN109862498A (en) 2019-01-28 2019-01-28 Digital hearing aid sound source localization method based on a convolutional neural network


Publications (1)

Publication Number Publication Date
CN109862498A true CN109862498A (en) 2019-06-07

Family

ID=66896296

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910077998.8A Pending CN109862498A (en) Digital hearing aid sound source localization method based on a convolutional neural network

Country Status (1)

Country Link
CN (1) CN109862498A (en)


Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106331973A (en) * 2016-10-20 2017-01-11 天津大学 Realization method of hearing-aid filter bank based on portable terminal
WO2017191249A1 (en) * 2016-05-06 2017-11-09 Robert Bosch Gmbh Speech enhancement and audio event detection for an environment with non-stationary noise
CN108024188A (en) * 2017-09-30 2018-05-11 天津大学 A high-intelligibility speech denoising algorithm based on an intelligent terminal
CN108122559A (en) * 2017-12-21 2018-06-05 北京工业大学 Binaural sound source localization method based on deep learning in a digital hearing aid
CN108717178A (en) * 2018-04-12 2018-10-30 福州瑞芯微电子股份有限公司 A sound source localization method and device based on neural networks

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
谈雅文, 王立杰, 姚昕羽, 汤一彬, 周琳: "Binaural sound source localization algorithm based on BP neural network", Audio Engineering (《电声技术》) *

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113924786A (en) * 2019-06-09 2022-01-11 根特大学 Neural network model for cochlear mechanics and processing
US11800301B2 (en) 2019-06-09 2023-10-24 Universiteit Gent Neural network model for cochlear mechanics and processing
CN113924786B (en) * 2019-06-09 2024-03-29 根特大学 Neural network model for cochlear mechanics and processing
CN112201259A (en) * 2020-09-23 2021-01-08 北京百度网讯科技有限公司 Sound source positioning method, device, equipment and computer storage medium
CN112201259B (en) * 2020-09-23 2022-11-25 北京百度网讯科技有限公司 Sound source positioning method, device, equipment and computer storage medium

Similar Documents

Publication Publication Date Title
Vecchiotti et al. End-to-end binaural sound localisation from the raw waveform
Qian et al. Very deep convolutional neural networks for noise robust speech recognition
CN110517705B (en) Binaural sound source positioning method and system based on deep neural network and convolutional neural network
CN111833896B (en) Voice enhancement method, system, device and storage medium for fusing feedback signals
CN110728989B Binaural speech separation method based on long short-term memory (LSTM) network
CN109215665A A voiceprint recognition method based on 3D convolutional neural networks
CN105575403A (en) Cross-correlation sound source positioning method with combination of auditory masking and double-ear signal frames
CN107507625A (en) Sound source distance determines method and device
Li et al. Sams-net: A sliced attention-based neural network for music source separation
CN109862498A Digital hearing aid sound source localization method based on a convolutional neural network
CN112885375A (en) Global signal-to-noise ratio estimation method based on auditory filter bank and convolutional neural network
CN110501673A (en) A kind of binaural sound source direction in space estimation method and system based on multitask time-frequency convolutional neural networks
CN109831732A Intelligent howling suppression device and method based on a smartphone
CN109448702A Auditory scene recognition method for cochlear implants
Lin et al. Bionic optimization of MFCC features based on speaker fast recognition
WO2020062679A1 (en) End-to-end speaker diarization method and system employing deep learning
CN112397090B (en) Real-time sound classification method and system based on FPGA
CN105609099A (en) Speech recognition pretreatment method based on human auditory characteristic
Krecichwost et al. Automated detection of sigmatism using deep learning applied to multichannel speech signal
CN106128480B A method for voice activity detection on noisy speech
CN112908353A (en) Voice enhancement method for hearing aid by combining edge computing and cloud computing
CN113327589B (en) Voice activity detection method based on attitude sensor
CN115472168A (en) Short-time voice voiceprint recognition method, system and equipment coupling BGCC and PWPE characteristics
CN114550701A (en) Deep neural network-based Chinese electronic larynx voice conversion device and method
Elmahdy et al. Subvocal speech recognition via close-talk microphone and surface electromyogram using deep learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20190607