CN109862498A - Digital hearing aid sound source localization method based on a convolutional neural network - Google Patents
Digital hearing aid sound source localization method based on a convolutional neural network
- Publication number
- CN109862498A CN109862498A CN201910077998.8A CN201910077998A CN109862498A CN 109862498 A CN109862498 A CN 109862498A CN 201910077998 A CN201910077998 A CN 201910077998A CN 109862498 A CN109862498 A CN 109862498A
- Authority
- CN
- China
- Prior art keywords
- hearing aid
- neural networks
- convolutional neural
- voice data
- intelligent terminal
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Landscapes
- Circuit For Audible Band Transducer (AREA)
Abstract
The present invention discloses a digital hearing aid sound source localization method based on a convolutional neural network, comprising the steps of: producing training voice data, playing the voice data, and collecting it with a hearing aid and an intelligent terminal; training a neural network with the collected training voice data as input and sound source direction data as output, and loading the trained convolutional neural network into the intelligent terminal. In use, the hearing aid and the intelligent terminal establish a communication connection; after receiving external sound, the hearing aid transmits it to the intelligent terminal, and the convolutional neural network outputs sound source direction data from the voice data received by the hearing aid and the voice data received in real time by the intelligent terminal, then sends the result back to the hearing aid. The present invention can accurately obtain the direction of a sound source relative to the hearing aid user.
Description
Technical field
The present invention relates to the technical field of sound source localization, and more particularly to a digital hearing aid sound source localization method based on a convolutional neural network.
Background technique
Hearing loss is one of the most common chronic diseases today, particularly among the elderly. According to information published by the World Health Organization in 2018, 466 million people worldwide suffer from disabling hearing loss, at an annual cost of up to 750 billion dollars, and more than one third of people over 65 are affected. The impact of hearing loss on patients is enormous: for example, among the elderly with mild, moderate, and severe hearing loss, the incidence of Alzheimer's disease is respectively 2, 3, and 5 times that of elderly people with normal hearing. Hearing loss may also cause serious psychological problems such as severe insomnia, cognitive decline, and depression.
A hearing aid can help improve the hearing of a person with hearing loss and contributes greatly to recovery; the World Health Organization therefore recommends that people with disabling hearing loss wear a suitable hearing aid. Traditional analog hearing aids use a linear amplifier circuit that amplifies all input audio signals linearly. However, because hearing aid users are often insensitive to certain speech signals, this tends to produce the awkward situation of "too quiet to hear, too loud to bear". To solve this problem, digital hearing aids, which amplify according to each user's particular hearing loss, came into being. A digital hearing aid must perform a series of processing steps on the received speech signal, yet the user's living environment often contains many kinds of noise that severely degrade the hearing aid's speech processing. If a hearing aid could determine the direction of a sound source as easily as the human brain, it could perform further processing such as directional speech enhancement and greatly improve the user experience. However, existing digital hearing aids either do not use sound source localization at all, or use traditional localization techniques that are computationally complex but only moderately effective, leaving great room for improvement in speech processing.
Summary of the invention
In view of the technical drawbacks of the prior art, the object of the present invention is to provide a digital hearing aid sound source localization method based on a convolutional neural network. By constructing a more effective convolutional neural network model for hearing aid sound source localization, the method aims to solve the problems of insufficient sound source direction information and poor speech processing in existing hearing aids, thereby improving localization accuracy and the hearing aid user's experience.
The technical solution adopted to achieve the object of the present invention is as follows:
A digital hearing aid sound source localization method based on a convolutional neural network, comprising the following steps:
producing training voice data, playing the voice data, and collecting it with a hearing aid and an intelligent terminal;
training a neural network with the collected training voice data as input and sound source direction data as output, and loading the trained convolutional neural network into the intelligent terminal;
in use, establishing a communication connection between the hearing aid and the intelligent terminal; after receiving external sound, the hearing aid transmits it to the intelligent terminal, and the convolutional neural network outputs sound source direction data from the voice data received by the hearing aid and the voice data received in real time by the intelligent terminal, then sends it back to the hearing aid.
Before voice data is input into the convolutional neural network, the method further includes a preprocessing step: the voice data is first preprocessed, then speech features are extracted by the mel-frequency cepstral coefficient (MFCC) method, passing the speech signal through a mel filter bank to obtain a 24-dimensional feature signal; the frame length used for framing is adjusted so that the signal produced by the mel filter bank forms a 24 × 24 data matrix.
The left-ear amplitude information, left-ear phase information, right-ear amplitude information, and right-ear phase information from the hearing aid's binaural voice data, together with the amplitude information from the intelligent terminal voice data, five channels in total, are integrated into a 24 × 24 × 5 matrix and input to the convolutional neural network, which outputs sound source direction data.
The convolutional neural network has a seven-layer structure; from input to output it comprises a first convolutional layer, a pooling layer, a second convolutional layer, three fully connected layers, and a softmax layer. After a 24 × 24 × 5 voice data matrix is input, a 90-dimensional vector is output.
Compared with the prior art, the beneficial effects of the present invention are:
the same segment of speech received by the microphones of the hearing aid and the intelligent terminal is preprocessed and integrated, then passed to the trained convolutional neural network, which computes the direction of the sound source relative to the hearing aid user so that the hearing aid can perform further speech processing such as denoising and speech recognition.
Brief description of the drawings
Fig. 1 is a flowchart of the digital hearing aid sound source localization method based on a convolutional neural network according to the present invention.
Fig. 2 is a structural schematic diagram of the convolutional neural network used in the present invention.
Specific embodiment
The present invention is described in further detail below in conjunction with the drawings and specific embodiments. It should be understood that the specific embodiments described herein are only used to explain the present invention and are not intended to limit it.
The present invention is a digital hearing aid sound source localization method based on a convolutional neural network and assisted by an intelligent terminal: the same segment of speech received by the microphones of the hearing aid and the intelligent terminal is preprocessed and integrated, then passed to the trained convolutional neural network, which computes the direction of the sound source relative to the hearing aid user.
As shown in Fig. 1, the digital hearing aid sound source localization method based on a convolutional neural network according to the present invention includes the following steps:
producing training voice data, playing the voice data, and collecting it with a hearing aid and an intelligent terminal;
training the convolutional neural network with the voice data collected by the hearing aid and the intelligent terminal as input and sound source direction data as output, and loading the trained network into the intelligent terminal;
in use, connecting the hearing aid and the intelligent terminal by Bluetooth; while they are connected, the hearing aid sends the collected voice data to the intelligent terminal, and the convolutional neural network on the terminal takes the voice data received from the hearing aid and the voice data collected by the terminal itself as input, outputs sound source direction data, and sends it back to the hearing aid.
After the hearing aid receives the direction data, its digital signal processing (Digital Signal Processing, DSP) module performs further speech processing accordingly.
Specifically, voice data from the TIMIT corpus and the NoiseX-92 noise library can be mixed at different signal-to-noise ratios as training data; these voice data are played back and re-recorded with the hearing aid and the intelligent terminal to simulate the actual conditions of hearing aid use. The training data are then input to a computer on which a neural network program has been written in advance using the TensorFlow machine learning framework. After the re-recorded voice data are input, the network is trained on the training data according to the standard training procedure for convolutional neural networks. When training reaches the set number of iterations, the trained convolutional neural network is saved and imported into the hearing aid's companion application (app) on the terminal for use.
In use, the hearing aid and the intelligent terminal are first connected by Bluetooth. While they are connected, if the voice activity detection (Voice Activity Detection, VAD) module in the hearing aid detects voice data, the voice data collected by the left and right microphones are sent to the terminal. Using the voice data it receives itself together with the data sent by the hearing aid, the terminal determines the direction of the sound source with the convolutional neural network localization algorithm of the present invention and sends the direction data back to the hearing aid, whose digital signal processing (Digital Signal Processing, DSP) module then performs further speech processing.
The present invention uses the microphones of the hearing aid and the intelligent terminal together as the speech signal source. When a speech signal is detected, three segments of voice data from separate sources, the left-ear hearing aid, the right-ear hearing aid, and the terminal microphone, are collected. Each segment is first preprocessed with operations such as windowing, framing, and the short-time Fourier transform. Speech features are then extracted by the mel-frequency cepstral coefficient method: the speech signal is passed through a mel filter bank and converted into a 24-dimensional feature signal. The frame length used for framing is adjusted so that the mel filter bank produces exactly a 24 × 24 data matrix, convenient for the convolutional neural network to process. Since the relative positions of the left- and right-ear hearing aids are fixed while the position of the intelligent terminal relative to the hearing aids can vary, the present invention keeps both amplitude and phase information of the left- and right-ear hearing aid signals as the main signals, and keeps only the amplitude information of the terminal microphone signal as an auxiliary signal. The five channels, left-ear amplitude, left-ear phase, right-ear amplitude, right-ear phase, and terminal amplitude, are integrated into a 24 × 24 × 5 matrix and passed to the convolutional neural network, which finally yields the direction of the sound source.
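The exact mel filter bank and phase-feature computation are not fully specified in the patent. The following sketch (an overlapping rectangular filter bank stands in for proper mel triangles, and a random signal stands in for a recording) shows how a 24 × 24 feature map per channel could be produced and the five channels stacked into the 24 × 24 × 5 network input:

```python
import numpy as np

def frame_signal(x, n_frames=24, frame_len=512):
    """Split x into exactly n_frames overlapping frames, mirroring the
    patent's idea of tuning the framing so that 24 frames come out."""
    hop = (len(x) - frame_len) // (n_frames - 1)
    return np.stack([x[i * hop:i * hop + frame_len] for i in range(n_frames)])

def feature_map(x, n_bands=24, frame_len=512):
    """24 frames x 24 bands of magnitude features.  Overlapping rectangular
    bands stand in here for the mel filter bank described in the text."""
    frames = frame_signal(x, 24, frame_len) * np.hanning(frame_len)
    mag = np.abs(np.fft.rfft(frames, axis=1))        # (24, frame_len//2 + 1)
    edges = np.linspace(0, mag.shape[1], n_bands + 2).astype(int)
    return np.stack(
        [mag[:, edges[i]:edges[i + 2]].mean(axis=1) for i in range(n_bands)],
        axis=1,
    )                                                 # (24, 24)

rng = np.random.default_rng(1)
sig = rng.standard_normal(16000)                      # stand-in recording
left_amp = feature_map(sig)                           # one of the five channels
# left_phase, right_amp, right_phase, and term_amp would be built similarly;
# identical copies are stacked here only to demonstrate the final input shape.
cnn_input = np.stack([left_amp] * 5, axis=-1)
```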
The intelligent terminal can be a mobile phone, a tablet computer, or another portable intelligent terminal device.
The convolutional neural network used in the present invention has a seven-layer structure; from input to output it comprises convolutional layer 1, a pooling layer, convolutional layer 2, three fully connected layers, and a softmax layer. After a 24 × 24 × 5 voice data matrix is input, a 90-dimensional vector is output.
The concrete operation of each layer of the convolutional neural network used in the present invention is as follows:
Step 1: convolutional layer 1. The filter size of this convolutional layer is 5 × 5, the depth is 5, and the stride is 1. The output matrix size follows the formula
out_l = (in_l - filter_l) / step + 1, out_w = (in_w - filter_w) / step + 1,
where out, in, filter, and step denote this layer's output, input, filter, and stride respectively, and the subscripts l and w denote the length and width of each attribute. By this formula, inputting the 24 × 24 × 5 initial data matrix into this layer yields a 20 × 20 × 5 output. For each 1 × 1 × 5 unit node matrix, let w(i)_{x,y,z} and a_{x,y,z} denote, for the i-th node of the unit node matrix, the weight and input value at the corresponding input node (x, y, z), and let b_i denote the bias of the i-th node. Then the value of the i-th node in the unit node matrix is
H(i) = f( Σ_{x,y,z} w(i)_{x,y,z} · a_{x,y,z} + b_i ),
where
f(x) = ReLU(x) = max(0, x)
is the rectified linear unit (Rectified Linear Unit, ReLU), a common activation function in neural networks.
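The output-size formula above can be checked in a few lines of Python (the helper name `conv_out` is illustrative, not from the patent):

```python
def conv_out(in_size, filter_size, stride):
    """Valid-padding output size: (in - filter) / stride + 1."""
    assert (in_size - filter_size) % stride == 0
    return (in_size - filter_size) // stride + 1

assert conv_out(24, 5, 1) == 20   # convolutional layer 1: 24x24x5 -> 20x20x5
assert conv_out(20, 2, 2) == 10   # 2x2 max pooling, stride 2:   -> 10x10x5
assert conv_out(10, 5, 1) == 6    # convolutional layer 2:       -> 6x6x16
```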
Step 2: pooling layer. This is a max pooling layer with stride 2: the input matrix is split into 2 × 2 submatrices, the maximum of the four elements of each submatrix is retained, and a new matrix is formed. The same split-and-pool operation is applied at every depth, so inputting the 20 × 20 × 5 matrix yields a 10 × 10 × 5 output matrix.
Step 3: convolutional layer 2. The filter size is 5 × 5, the depth is 16, and the stride is 1. Using a method and formulas similar to those of step 1, with the node index now i = 1, 2, ..., 16, this layer yields a 6 × 6 × 16 output matrix.
Step 4: fully connected layer 1. After this layer, a 1 × 1 × 360 output matrix is obtained, i.e. the number of output nodes is 360, where the value of the i-th output node is
g(i) = f( Σ_x w(i)_x · a_x + b_i ).
Step 5: fully connected layer 2. This layer has 360 input nodes and 180 output nodes. Let w(i)_x and a_x denote the weight and input value for the i-th output node; then its value is
g(i) = f( Σ_x w(i)_x · a_x + b_i ).
Step 6: fully connected layer 3. This layer has 180 input nodes and 90 output nodes, computed in the same way as step 5. A softmax (normalized exponential function) layer follows this output layer, converting the outputs g(i) into a probability distribution o(i):
o(i) = e^{g(i)} / Σ_j e^{g(j)}.
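Since the embodiment mentions TensorFlow, the seven-layer structure of steps 1 to 6 could be sketched with the Keras API roughly as follows (layer hyperparameters follow the text; everything else, such as the optimizer and training loop, is omitted):

```python
import numpy as np
import tensorflow as tf

# Seven-layer network: conv1 -> max pool -> conv2 -> three fully
# connected layers -> softmax, as described in steps 1-6 above.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(24, 24, 5)),                     # 24x24x5 input
    tf.keras.layers.Conv2D(5, (5, 5), activation="relu"),  # -> 20x20x5
    tf.keras.layers.MaxPooling2D((2, 2), strides=2),       # -> 10x10x5
    tf.keras.layers.Conv2D(16, (5, 5), activation="relu"), # -> 6x6x16
    tf.keras.layers.Flatten(),                             # -> 576 nodes
    tf.keras.layers.Dense(360, activation="relu"),
    tf.keras.layers.Dense(180, activation="relu"),
    tf.keras.layers.Dense(90, activation="softmax"),       # 90 direction bins
])

probs = model(np.zeros((1, 24, 24, 5), dtype="float32"))
```

The softmax output sums to 1, matching the probability-distribution interpretation below.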
Through the operation of the convolutional neural network, a 90-dimensional vector satisfying a probability distribution is finally obtained. The full circle of directions is divided into 90 parts at 4° intervals, and the 90 dimensions correspond to the 90 directions starting from straight ahead and proceeding clockwise. When the values of certain dimensions of the vector are significantly larger than the others, the sound source is considered to lie in that direction.
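Reading off the direction from the 90-dimensional output can then be as simple as taking the largest bin. The exact bin-to-angle convention below (bin k meaning k × 4° clockwise from straight ahead) is an assumption consistent with the description:

```python
import numpy as np

def decode_direction(probs, bin_deg=4.0):
    """Return the angle, in degrees clockwise from straight ahead,
    of the most probable of the 90 direction bins."""
    assert probs.shape == (90,)
    return int(np.argmax(probs)) * bin_deg

probs = np.zeros(90)
probs[22] = 1.0                  # all probability mass in bin 22
angle = decode_direction(probs)  # 22 * 4 degrees
```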
In the present invention, when the hearing aid's VAD module detects a speech signal, the left- and right-ear hearing aids and the intelligent terminal microphone simultaneously receive the speech signal and preprocess it, then pass it to the trained convolutional neural network, thereby obtaining the direction of the sound source relative to the hearing aid user.
The present invention applies a convolutional neural network to the hearing aid sound source localization algorithm. Because a convolutional neural network can extract features directly by convolving over speech frequency bins, the present invention can greatly improve localization accuracy and, through further processing such as directional speech enhancement and recognition, greatly improve the hearing aid user's experience.
The above is only a preferred embodiment of the present invention. It should be noted that those of ordinary skill in the art may make various improvements and modifications without departing from the principle of the present invention, and these improvements and modifications should also be regarded as falling within the protection scope of the present invention.
Claims (3)
1. A digital hearing aid sound source localization method based on a convolutional neural network, characterized by comprising the following steps:
producing training voice data, playing the voice data, and collecting it with a hearing aid and an intelligent terminal;
training a neural network with the collected training voice data as input and sound source direction data as output, and loading the trained convolutional neural network into the intelligent terminal;
in use, establishing a communication connection between the hearing aid and the intelligent terminal; after receiving external sound, the hearing aid transmits it to the intelligent terminal, and the convolutional neural network outputs sound source direction data from the voice data received by the hearing aid and the voice data received in real time by the intelligent terminal, then sends it back to the hearing aid.
2. The digital hearing aid sound source localization method based on a convolutional neural network according to claim 1, characterized in that, before voice data is input into the convolutional neural network, the method further includes a preprocessing step: the voice data is first preprocessed, then speech features are extracted by the mel-frequency cepstral coefficient method, passing the speech signal through a mel filter bank to obtain a 24-dimensional feature signal; the frame length used for framing is adjusted so that the signal produced by the mel filter bank forms a 24 × 24 data matrix;
the left-ear amplitude information, left-ear phase information, right-ear amplitude information, and right-ear phase information from the hearing aid's binaural voice data, together with the amplitude information from the intelligent terminal voice data, five channels in total, are integrated into a 24 × 24 × 5 matrix and input to the convolutional neural network, which outputs sound source direction data.
3. The digital hearing aid sound source localization method based on a convolutional neural network according to claim 2, characterized in that the convolutional neural network has a seven-layer structure, comprising from input to output a first convolutional layer, a pooling layer, a second convolutional layer, three fully connected layers, and a softmax layer, and outputs a 90-dimensional vector after a 24 × 24 × 5 voice data matrix is input.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910077998.8A CN109862498A (en) | 2019-01-28 | 2019-01-28 | A kind of digital deaf-aid sound source direction method based on convolutional neural networks |
Publications (1)
Publication Number | Publication Date |
---|---|
CN109862498A true CN109862498A (en) | 2019-06-07 |
Family
ID=66896296
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109862498A (en) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106331973A (en) * | 2016-10-20 | 2017-01-11 | 天津大学 | Realization method of hearing-aid filter bank based on portable terminal |
WO2017191249A1 (en) * | 2016-05-06 | 2017-11-09 | Robert Bosch Gmbh | Speech enhancement and audio event detection for an environment with non-stationary noise |
CN108024188A (en) * | 2017-09-30 | 2018-05-11 | 天津大学 | A kind of high intelligibility voice de-noising algorithm based on intelligent terminal |
CN108122559A (en) * | 2017-12-21 | 2018-06-05 | 北京工业大学 | Binaural sound sources localization method based on deep learning in a kind of digital deaf-aid |
CN108717178A (en) * | 2018-04-12 | 2018-10-30 | 福州瑞芯微电子股份有限公司 | A kind of sound localization method and device based on neural network |
Non-Patent Citations (1)
Title |
---|
Tan Yawen, Wang Lijie, Yao Xinyu, Tang Yibin, Zhou Lin: "Binaural sound source localization algorithm based on BP neural network", Audio Engineering (《电声技术》) * |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113924786A (en) * | 2019-06-09 | 2022-01-11 | 根特大学 | Neural network model for cochlear mechanics and processing |
US11800301B2 (en) | 2019-06-09 | 2023-10-24 | Universiteit Gent | Neural network model for cochlear mechanics and processing |
CN113924786B (en) * | 2019-06-09 | 2024-03-29 | 根特大学 | Neural network model for cochlear mechanics and processing |
CN112201259A (en) * | 2020-09-23 | 2021-01-08 | 北京百度网讯科技有限公司 | Sound source positioning method, device, equipment and computer storage medium |
CN112201259B (en) * | 2020-09-23 | 2022-11-25 | 北京百度网讯科技有限公司 | Sound source positioning method, device, equipment and computer storage medium |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Vecchiotti et al. | End-to-end binaural sound localisation from the raw waveform | |
Qian et al. | Very deep convolutional neural networks for noise robust speech recognition | |
CN110517705B (en) | Binaural sound source positioning method and system based on deep neural network and convolutional neural network | |
CN111833896B (en) | Voice enhancement method, system, device and storage medium for fusing feedback signals | |
CN110728989B | Binaural speech separation method based on long short-term memory network (LSTM) | |
CN109215665A (en) | A kind of method for recognizing sound-groove based on 3D convolutional neural networks | |
CN105575403A (en) | Cross-correlation sound source positioning method with combination of auditory masking and double-ear signal frames | |
CN107507625A (en) | Sound source distance determines method and device | |
Li et al. | Sams-net: A sliced attention-based neural network for music source separation | |
CN109862498A (en) | A kind of digital deaf-aid sound source direction method based on convolutional neural networks | |
CN112885375A (en) | Global signal-to-noise ratio estimation method based on auditory filter bank and convolutional neural network | |
CN110501673A (en) | A kind of binaural sound source direction in space estimation method and system based on multitask time-frequency convolutional neural networks | |
CN109831732A (en) | Intelligent chauvent's criterion device and method based on smart phone | |
CN109448702A (en) | Artificial cochlea's auditory scene recognition methods | |
Lin et al. | Bionic optimization of MFCC features based on speaker fast recognition | |
WO2020062679A1 (en) | End-to-end speaker diarization method and system employing deep learning | |
CN112397090B (en) | Real-time sound classification method and system based on FPGA | |
CN105609099A (en) | Speech recognition pretreatment method based on human auditory characteristic | |
Krecichwost et al. | Automated detection of sigmatism using deep learning applied to multichannel speech signal | |
CN106128480B (en) | The method that a kind of pair of noisy speech carries out voice activity detection | |
CN112908353A (en) | Voice enhancement method for hearing aid by combining edge computing and cloud computing | |
CN113327589B (en) | Voice activity detection method based on attitude sensor | |
CN115472168A (en) | Short-time voice voiceprint recognition method, system and equipment coupling BGCC and PWPE characteristics | |
CN114550701A (en) | Deep neural network-based Chinese electronic larynx voice conversion device and method | |
Elmahdy et al. | Subvocal speech recognition via close-talk microphone and surface electromyogram using deep learning |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| WD01 | Invention patent application deemed withdrawn after publication | Application publication date: 20190607 |