CN108962230B - Audio recognition method based on memristor - Google Patents

Info

Publication number
CN108962230B
CN108962230B CN201810843588.5A CN201810843588A CN108962230B CN 108962230 B CN108962230 B CN 108962230B CN 201810843588 A CN201810843588 A CN 201810843588A CN 108962230 B CN108962230 B CN 108962230B
Authority
CN
China
Prior art keywords
memristor
speech recognition
training
speech
network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810843588.5A
Other languages
Chinese (zh)
Other versions
CN108962230A (en)
Inventor
李正浩
李琪
唐永亮
李靖禾
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chongqing Inple Technology Co Ltd
Original Assignee
Chongqing Inple Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chongqing Inple Technology Co Ltd
Priority to CN201810843588.5A
Publication of CN108962230A
Application granted
Publication of CN108962230B
Legal status: Active
Anticipated expiration

Classifications

    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00: Speech recognition
    • G10L15/06: Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
    • G10L15/063: Training
    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00: Speech recognition
    • G10L15/08: Speech classification or search
    • G10L15/16: Speech classification or search using artificial neural networks
    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00: Speech recognition
    • G10L15/26: Speech to text systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Character Discrimination (AREA)

Abstract

The invention discloses a memristor-based speech recognition method, comprising: selecting memristor devices and building memristor neural network units; constructing a memristor speech recognition network to be trained; inputting speech training signals into the network to be trained and returning its recognition results to an FPGA; using the FPGA to calculate the deviation between the recognition results and the speech training signals and to train the network, thereby obtaining the required memristor speech recognition network; and inputting a speech signal to be recognized to obtain the text of its content. The notable effects are: the method lowers the precision required of the hardware and reduces its preparation difficulty and time, while realizing a more advanced multilayer convolutional neural network in hardware; and it combines speech recognition with memristors effectively, using batch-wise connection to reduce interference from hardware crosstalk, so that speech recognition accuracy is greatly improved.

Description

Audio recognition method based on memristor
Technical field
The present invention relates to the technical field of neural-network-based speech recognition, and specifically to a memristor-based speech recognition method.
Background technique
A memristor (short for memory resistor) is a circuit element that expresses the relationship between magnetic flux and charge. A memristor has the dimension of resistance, but unlike a resistor, its resistance is determined by the charge that has flowed through it. Therefore, by measuring the resistance of a memristor, the amount of charge that has passed through it can be determined, so the device effectively remembers charge. The emergence of nanoscale memristors is expected to make non-volatile random access memory possible. Moreover, memristor-based random access memory is expected to outperform conventional random access memory in integration density, power consumption, and read/write speed. In addition, the memristor is the best way to implement artificial neural network synapses in hardware.
In 2009, researchers at the US National Institute of Standards and Technology (NIST) realized a low-power flexible memory circuit whose impedance shows the ability to remember the total current that has flowed through it. In the same year, the University of Michigan developed a chip composed of memristors that can store 103 bits of information, which makes smaller, faster, and cheaper chips possible. In May of the same year, Pascal O. Vontobel et al. proved that memristors can respond to synchronized voltage pulses in the same Spike-Timing-Dependent Plasticity (STDP) manner as the brain, which laid a foundation for building neural networks. In 2017, Alexander Serb, Johannes Bill et al. described how the resistance of memristive elements changes under LTP and LTD pulse driving, and demonstrated that memristors can implement simple neural network models. In the same year, Peng Yao et al. demonstrated that memristors can be used with a neural network model for face recognition and classification. The excellent properties of memristors leave broad room for development, and as research advances, the significance of applied memristor research is becoming increasingly evident.
Although scholars have implemented neural networks on memristor arrays, the networks realized are fairly basic, and the methods are still some distance from practical application. Meanwhile, the serious noise caused by hardware crosstalk has not been well solved, the way software and hardware are connected remains largely unexplored, and the application of memristors to speech recognition has rarely been addressed.
Summary of the invention
In view of the deficiencies of the prior art, the object of the present invention is to provide a memristor-based speech recognition method that can quantitatively calculate a neural network's requirement on device precision and build a memristor-based speech recognition network, thereby indirectly lowering the precision required in hardware preparation and achieving accurate speech recognition.
To achieve the above object, the technical solution adopted by the present invention is as follows:
A memristor-based speech recognition method, characterized in that it comprises the following steps:
Step 1: calculate the recognition accuracy of the speech recognition network simulated with devices of different precision, and, according to the lowest precision standard that still gives a reasonable recognition accuracy, select memristor devices and build memristor neural network units;
Step 2: according to the speech recognition network, connect an FPGA to multiple memristor neural network units to construct the memristor speech recognition network to be trained;
Step 3: input speech training signals into the memristor speech recognition network to be trained, and return its recognition results to the training module of the FPGA;
Step 4: the training module of the FPGA calculates, from the input recognition results, the deviation between the recognition results and the labels of the speech training signals, and trains the memristor speech recognition network according to the backpropagation function and the write rule of the memristor array, obtaining the required memristor speech recognition network;
Step 5: input the speech signal to be recognized into the FPGA of the required memristor speech recognition network to obtain the text of the speech signal content.
Further, the lowest precision standard in step 1 is obtained as follows:
Step 1.1: build and train the speech recognition network on a computer;
Step 1.2: set the convolution kernel values obtained from training to a series of precision formats, and run simulation tests;
Step 1.3: plot the recognition accuracy curve from the test results, and take the lowest precision that still gives a reasonable recognition accuracy as the lowest precision standard for the devices.
Further, the memristor speech recognition network to be trained in step 2 is obtained as follows:
Step 2.1: first define on the FPGA all arithmetic logic of the speech recognition network other than the dot-product operations;
Step 2.2: build the first convolution + pooling layer from the memristor neural network units;
Step 2.3: build the second convolution + pooling layer from the memristor neural network units;
Step 2.4: build the fully connected layer from the memristor neural network units;
Step 2.5: connect the input/output ports of the first convolution + pooling layer, the second convolution + pooling layer, and the fully connected layer to the I/O ports of the FPGA, obtaining the memristor speech recognition network to be trained.
Further, the dot-product operations are performed on the memristor neural network units, and the arithmetic logic comprises the activation function, batch-wise addition of the dot-product unit outputs, feedback-controlled training, quantization of signal input and output, and training pulse voltage control.
Further, the memristor neural network unit is a memristor matrix array arranged as a × a, where a is an integer greater than or equal to 1.
Further, the memristor neural network unit consists of 9 memristors arranged in a 3 × 3 matrix.
Further, the first convolution + pooling layer, the second convolution + pooling layer, and the fully connected layer each consist of multiple memristor neural network units arranged in a p × q matrix, where p and q are unequal integers greater than 1.
Further, the first convolution + pooling layer and the second convolution + pooling layer each consist of 234 memristor neural network units arranged in a 9 × 26 matrix; the fully connected layer consists of 400 memristor neural network units arranged in a 100 × 4 matrix.
Further, in step 3, after the speech training signal is input to the memristor speech recognition network to be trained, the FPGA performs MFCC extraction and quantization on the speech training signal and converts it into the input signals of the memristor neural network units.
Further, the number of training iterations is no fewer than 60.
The notable effects of the invention are as follows. First, a self-developed toolbox is used to calculate and determine which memristor devices can be used. Then small-scale memristor arrays serve as basic arrays and are arranged in a defined pattern to form large processing arrays, building a network for speech recognition. This not only lowers the precision required of the hardware and reduces its preparation difficulty and time while realizing a more advanced multilayer convolutional neural network in hardware, but also combines speech recognition with memristors effectively; the batch-wise connection reduces interference from hardware crosstalk, so that speech recognition accuracy is greatly improved.
Brief description of the drawings
Fig. 1 is a flow chart of the method of the present invention;
Fig. 2 is the plotted recognition accuracy curve;
Fig. 3 is a structural schematic diagram of the memristor neural network unit;
Fig. 4 is a schematic diagram of the memristor array arrangement of a convolution + pooling layer;
Fig. 5 is a schematic diagram of the memristor array arrangement of the fully connected layer;
Fig. 6 is the recognition error rate curve of the present invention.
Detailed description of the embodiments
The specific embodiment and working principle of the present invention are described in further detail below with reference to the accompanying drawings.
As shown in Fig. 1, a memristor-based speech recognition method comprises the following specific steps:
Step 1: calculate the recognition accuracy of the speech recognition network simulated with devices of different precision, and, according to the lowest precision standard that still gives a reasonable recognition accuracy, select memristor devices and build memristor neural network units;
The lowest precision standard is obtained as follows:
Step 1.1: build and train the speech recognition network on a computer;
Step 1.2: set the convolution kernel values obtained from training to a series of precision formats, and run simulation tests;
Step 1.3: plot the recognition accuracy curve shown in Fig. 2 from the test results, and choose the lowest precision that still gives an acceptable recognition rate as the minimum standard for device precision. Device precision is expressed as the magnitude of the resistance fluctuation range during continuous adjustment.
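Steps 1.1 to 1.3 can be illustrated with a minimal sketch. The patent does not specify the precision formats or the simulator, so the signed fixed-point bit widths, the `evaluate` callback (a stand-in for the simulated recognition test), and all function names below are assumptions made for illustration only.

```python
def quantize(value, bits, w_max=1.0):
    """Round a kernel value to a signed fixed-point grid with `bits` bits (assumed format)."""
    levels = 2 ** (bits - 1) - 1              # number of positive steps on the grid
    clipped = max(-w_max, min(w_max, value))  # saturate values outside [-w_max, w_max]
    return round(clipped / w_max * levels) / levels * w_max

def accuracy_curve(kernels, evaluate, bit_widths):
    """Step 1.2: re-test the trained kernels at each candidate precision.

    `evaluate` is a hypothetical callback that returns recognition accuracy
    for a list of (quantized) kernel values.
    """
    return {b: evaluate([quantize(k, b) for k in kernels]) for b in bit_widths}

def lowest_precision(curve, min_accuracy):
    """Step 1.3: the coarsest precision whose accuracy is still acceptable."""
    acceptable = [b for b, acc in curve.items() if acc >= min_accuracy]
    return min(acceptable) if acceptable else None
```

With a curve such as `{2: 0.60, 4: 0.91, 6: 0.95, 8: 0.96}` and a 0.9 accuracy threshold, `lowest_precision` selects 4, which would then serve as the minimum device standard.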
Preferably, the memristor neural network unit is a rectangular memristor array of size 3 × 3: 9 memristors are arranged in an integrated chip in a matrix with 3 per row and 3 per column. That is, 9 memristors, combined with CMOS circuitry, transistors, and the like, are placed on a crossbar with 3 rows and 3 columns, in which the horizontal bars (rows) receive the voltage inputs V1, V2, V3 and the vertical bars (columns) provide the current outputs U1, U2, U3. The resulting memristor neural network unit is shown in Fig. 3.
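The dot product performed by one 3 × 3 unit can be sketched as follows, assuming ideal devices so that each column current is the sum of row voltage times memristor conductance (Ohm's law plus Kirchhoff's current law). The conductance and voltage values are hypothetical, and wire resistance and crosstalk are ignored.

```python
def crossbar_currents(voltages, conductances):
    """Column outputs of an ideal crossbar: U_j = sum_i V_i * G[i][j]."""
    rows, cols = len(conductances), len(conductances[0])
    if len(voltages) != rows:
        raise ValueError("one input voltage per crossbar row is required")
    return [sum(voltages[i] * conductances[i][j] for i in range(rows))
            for j in range(cols)]

# Hypothetical conductances in siemens: G[i][j] joins row i to column j.
G = [[1e-4, 2e-4, 0.0],
     [0.0, 1e-4, 3e-4],
     [2e-4, 0.0, 1e-4]]
V = [0.1, 0.2, 0.3]          # V1, V2, V3 applied to the rows, in volts
U = crossbar_currents(V, G)  # U1, U2, U3 read from the columns, in amperes
```

Each column therefore computes one dot product between the input vector and the weights stored in that column, which is why the remaining arithmetic logic can be left to the FPGA.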
Step 2: according to the speech recognition network, connect an FPGA (field-programmable gate array) to multiple memristor neural network units to construct the memristor speech recognition network to be trained. The specific steps are:
Step 2.1: first define on the FPGA all arithmetic logic of the speech recognition network other than the dot-product operations, including the activation function part, the batch-wise addition part for the dot-product unit outputs, the feedback-controlled training part, the signal input/output quantization part, the training pulse voltage control part, and so on.
Step 2.2: arrange 234 memristor neural network units in a 9 × 26 matrix and build the first convolution + pooling layer as shown in Fig. 4;
Step 2.3: arrange 234 memristor neural network units in a 9 × 26 matrix and build the second convolution + pooling layer as shown in Fig. 4;
Step 2.4: arrange 400 memristor neural network units in a 100 × 4 matrix and build the fully connected layer as shown in Fig. 5;
Step 2.5: connect the input/output ports of the first convolution + pooling layer, the second convolution + pooling layer, and the fully connected layer to the I/O ports of the FPGA, obtaining the memristor speech recognition network to be trained.
When the first and second convolution + pooling layers perform convolution, the part of the input matrix covered by the convolution kernel is flattened and fed, column by column, into the nine arrays of each column of arrays; each memristor column of each array column corresponds to one convolution kernel, and the U at the lower end of each column gives the output of that column's kernel. When the fully connected layer performs its operation, the last two columns are not used, so the 100 rows of arrays correspond to the 300 inputs of the fully connected layer and the 4 columns of arrays correspond to its 10 outputs, with the U at the lower end giving the output of the corresponding column. All arrays in the same row receive the same input.
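The fully connected arrangement described above can be sketched as a tiled matrix-vector product: 100 rows of 3 × 3 units take 300 inputs, 4 columns of units give 12 memristor columns, and dropping the last two leaves 10 outputs. The function names are hypothetical, devices are assumed ideal, and the accumulation step stands in for the batch-wise addition that the patent assigns to the FPGA.

```python
UNIT = 3  # each memristor neural network unit is a 3 x 3 crossbar

def unit_dot(v3, g3):
    """Dot products computed by one 3 x 3 unit (one output per column)."""
    return [sum(v3[i] * g3[i][j] for i in range(UNIT)) for j in range(UNIT)]

def tiled_matvec(x, W, used_outputs):
    """Compute x @ W by splitting W into 3 x 3 tiles and adding partial outputs."""
    n_in, n_out = len(W), len(W[0])
    if n_in % UNIT or n_out % UNIT or len(x) != n_in:
        raise ValueError("dimensions must match and be multiples of the unit size")
    y = [0.0] * n_out
    for r in range(0, n_in, UNIT):        # units in one row share the same 3 inputs
        v3 = x[r:r + UNIT]
        for c in range(0, n_out, UNIT):   # one unit per block of 3 output columns
            g3 = [row[c:c + UNIT] for row in W[r:r + UNIT]]
            part = unit_dot(v3, g3)
            for j in range(UNIT):
                y[c + j] += part[j]       # batch-wise addition of partial results
    return y[:used_outputs]               # trailing unused columns are dropped
```

For the layer in this embodiment, `W` would have 300 rows and 12 columns and `used_outputs` would be 10. A 9 × 26 convolution layer maps the same way onto 27 flattened inputs and 78 memristor columns.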
Step 3: input speech training signals into the memristor speech recognition network to be trained; the FPGA performs MFCC extraction and quantization on the speech training signal, converts it into the input signals of the memristor neural network units, and returns the recognition results to the training module of the FPGA;
During testing, the training module of the FPGA performs MFCC extraction and quantization on the signal, converts it into input signals for the memristor modules, and feeds them into the memristor speech recognition network. In training mode, the output signals returned by the memristor speech recognition network to be trained must also be quantized again.
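The quantization step can be sketched as a uniform mapping from feature values to a small set of drive voltages. The patent does not give the number of levels or the voltage range, so `n_levels`, `v_max`, and the min-max scaling are assumptions; MFCC extraction itself is not shown.

```python
def quantize_to_voltage(features, n_levels=16, v_max=0.3):
    """Map feature values (e.g. MFCC coefficients) onto n_levels voltages in [0, v_max]."""
    lo, hi = min(features), max(features)
    span = (hi - lo) or 1.0                  # avoid division by zero for constant input
    voltages = []
    for f in features:
        level = round((f - lo) / span * (n_levels - 1))   # integer level index
        voltages.append(level / (n_levels - 1) * v_max)   # level index to volts
    return voltages
```

The same routine could be reused, with different ranges, to re-quantize the currents returned by the memristor arrays during training.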
After the input signal has been quantized, the output feature map is calculated as follows:
$$x_j^l = f\left(\sum_{i \in M_j} x_i^{l-1} * k_{ij}^l + b_j^l\right)$$
where $x_j^l$ is the $j$-th feature map of layer $l$, $f(x)$ is the activation function, $M_j$ denotes the set of input feature maps, $k_{ij}^l$ is the layer-$l$ convolution kernel connecting the $i$-th input feature map to the $j$-th output feature map, and $b_j^l$ is the layer-$l$ bias of the $j$-th output feature map.
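A small software sketch of the feature map computation is given below, assuming the usual convolutional form in which each output map is the activation of the summed kernel responses plus a bias. The patent does not name the activation function, so `math.tanh` is only a placeholder, and the sliding-window product is implemented as cross-correlation, as is common in CNN code.

```python
import math

def conv2d_valid(x, k):
    """Sliding-window product of map x with kernel k, without padding."""
    kh, kw = len(k), len(k[0])
    out_h, out_w = len(x) - kh + 1, len(x[0]) - kw + 1
    return [[sum(x[r + a][c + b] * k[a][b] for a in range(kh) for b in range(kw))
             for c in range(out_w)] for r in range(out_h)]

def output_feature_map(inputs, kernels, bias, f=math.tanh):
    """One output map j: f(sum over input maps i of (x_i * k_ij) + b_j)."""
    acc = None
    for x_i, k_ij in zip(inputs, kernels):
        part = conv2d_valid(x_i, k_ij)
        if acc is None:
            acc = part
        else:  # add the response of this input map to the running sum
            acc = [[p + q for p, q in zip(pr, qr)] for pr, qr in zip(acc, part)]
    return [[f(v + bias) for v in row] for row in acc]
```

In the hardware described here, the inner sums are what the memristor columns compute, while the bias addition and `f` belong to the arithmetic logic defined on the FPGA.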
Step 4: the training module of the FPGA calculates, from the input recognition results, the deviation between the recognition results and the labels of the speech training signals, and trains the memristor speech recognition network according to the backpropagation function and the write rule of the memristor array; the number of training iterations is no fewer than 60, and the required memristor speech recognition network is obtained;
In this example, the deviation is calculated by a subtractor array programmed on the FPGA; the configured backpropagation function then converts the required weight changes into the control voltages and adjustment pulses needed to tune the memristors.
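The conversion from weight change to programming pulses can be sketched under a deliberately simple, hypothetical write rule: every pulse shifts the stored weight by a fixed `step`, positive pulses increase it and negative pulses decrease it. The patent does not disclose its actual write rule, so `step`, `v_set`, `v_reset`, and `max_pulses` are illustrative assumptions.

```python
def output_deviation(outputs, labels):
    """Element-wise deviation, as a subtractor array would compute it."""
    return [y - t for y, t in zip(outputs, labels)]

def pulses_for_update(delta_w, step=0.01, v_set=1.2, v_reset=-1.2, max_pulses=20):
    """Translate a desired weight change into (pulse count, pulse voltage).

    Assumes each pulse changes the weight by `step`; real devices are
    nonlinear, so this linear rule is only an approximation.
    """
    count = min(max_pulses, round(abs(delta_w) / step))
    if count == 0:
        return (0, 0.0)                      # change too small to program
    return (count, v_set if delta_w > 0 else v_reset)
```

A practical controller would likely read the device back after writing and repeat the update, since memristor conductance rarely changes by the same amount for every pulse.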
In addition, in this scheme the training result is evaluated with the mean square error equation to determine whether the training precision meets the standard; if it does not, the memristor speech recognition network is trained again until the precision meets the standard;
Mean square error calculation formula are as follows:
EnCorresponding n-th of sample,It is k-th of the output of n-th of sample,K-th of label of corresponding n-th of sample, k ∈ (1, c), c are the classification numbers that classification sum is also output.
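The error measure and the retrain-until-acceptable rule can be sketched together. The per-sample error below assumes the common half-sum-of-squares form, and `train_epoch`, `evaluate_error`, `target`, and `max_epochs` are hypothetical names and values; only the minimum of 60 training iterations comes from the text.

```python
def sample_error(outputs, labels):
    """Squared error for one sample: 0.5 * sum_k (t_k - y_k)^2 (assumed form)."""
    return 0.5 * sum((t - y) ** 2 for y, t in zip(outputs, labels))

def train_until_up_to_standard(train_epoch, evaluate_error, target,
                               min_epochs=60, max_epochs=1000):
    """Train at least min_epochs times, then continue until the error meets target.

    Returns the number of epochs used, or None if max_epochs was reached first.
    """
    for epoch in range(1, max_epochs + 1):
        train_epoch()                        # one pass of backpropagation and writes
        if epoch >= min_epochs and evaluate_error() <= target:
            return epoch
    return None
```

The upper bound `max_epochs` is a safeguard added for this sketch so that the loop terminates even if the hardware never reaches the target error.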
Step 5: input the speech signal to be recognized into the FPGA of the required memristor speech recognition network to obtain the text of the speech signal content.
After training is complete, the required memristor speech recognition network is fixed. The speech signal under test is then input to the FPGA, and the recognition network computes the text of the speech signal; the output mode can be adapted to the practical application.
In this embodiment, the speech recognition neural network obtained after 60 training iterations was used to recognize a total of 2000 test utterances representing ten different commands; the resulting recognition error rate is shown in Fig. 6. It can be seen that the invention makes it possible to choose devices and precision as needed while successfully realizing, with satisfactory precision, a more complex speech recognition neural network with good recognition accuracy.

Claims (8)

1. A memristor-based speech recognition method, characterized by comprising the following steps:
Step 1: calculate the recognition accuracy of the speech recognition network simulated with devices of different precision, and, according to the lowest precision standard that still gives a reasonable recognition accuracy, select memristor devices and build memristor neural network units;
Step 2: according to the speech recognition network, connect an FPGA to multiple memristor neural network units to construct the memristor speech recognition network to be trained;
Step 3: input speech training signals into the memristor speech recognition network to be trained, and return its recognition results to the training module of the FPGA;
Step 4: the training module of the FPGA calculates, from the input recognition results, the deviation between the recognition results and the labels of the speech training signals, and trains the memristor speech recognition network according to the backpropagation function and the write rule of the memristor array, obtaining the required memristor speech recognition network;
Step 5: input the speech signal to be recognized into the FPGA of the required memristor speech recognition network to obtain the text of the speech signal content;
The lowest precision standard in step 1 is obtained as follows:
Step 1.1: build and train the speech recognition network on a computer;
Step 1.2: set the convolution kernel values obtained from training to a series of precision formats, and run simulation tests;
Step 1.3: plot the recognition accuracy curve from the test results, and take the lowest precision that still gives a reasonable recognition accuracy as the lowest precision standard for the devices;
The memristor speech recognition network to be trained in step 2 is obtained as follows:
Step 2.1: first define on the FPGA all arithmetic logic of the speech recognition network other than the dot-product operations;
Step 2.2: build the first convolution + pooling layer from the memristor neural network units;
Step 2.3: build the second convolution + pooling layer from the memristor neural network units;
Step 2.4: build the fully connected layer from the memristor neural network units;
Step 2.5: connect the input/output ports of the first convolution + pooling layer, the second convolution + pooling layer, and the fully connected layer to the I/O ports of the FPGA, obtaining the memristor speech recognition network to be trained.
2. The memristor-based speech recognition method according to claim 1, characterized in that: the dot-product operations are performed on the memristor neural network units, and the arithmetic logic comprises the activation function, batch-wise addition of the dot-product unit outputs, feedback-controlled training, quantization of signal input and output, and training pulse voltage control.
3. The memristor-based speech recognition method according to claim 1 or 2, characterized in that: the memristor neural network unit is a memristor matrix array arranged as a × a, where a is an integer greater than or equal to 1.
4. The memristor-based speech recognition method according to claim 1, characterized in that: the memristor neural network unit consists of 9 memristors arranged in a 3 × 3 matrix.
5. The memristor-based speech recognition method according to claim 1, characterized in that: the first convolution + pooling layer, the second convolution + pooling layer, and the fully connected layer each consist of multiple memristor neural network units arranged in a p × q matrix, where p and q are unequal integers greater than 1.
6. The memristor-based speech recognition method according to claim 1 or 5, characterized in that: the first convolution + pooling layer and the second convolution + pooling layer each consist of 234 memristor neural network units arranged in a 9 × 26 matrix; the fully connected layer consists of 400 memristor neural network units arranged in a 100 × 4 matrix.
7. The memristor-based speech recognition method according to claim 1, characterized in that: in step 3, after the speech training signal is input to the memristor speech recognition network to be trained, the FPGA performs MFCC extraction and quantization on the speech training signal and converts it into the input signals of the memristor neural network units.
8. The memristor-based speech recognition method according to claim 1, characterized in that: the number of training iterations is no fewer than 60.
CN201810843588.5A 2018-07-27 2018-07-27 Audio recognition method based on memristor Active CN108962230B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810843588.5A CN108962230B (en) 2018-07-27 2018-07-27 Audio recognition method based on memristor

Publications (2)

Publication Number Publication Date
CN108962230A CN108962230A (en) 2018-12-07
CN108962230B true CN108962230B (en) 2019-04-23

Family

ID=64465701

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810843588.5A Active CN108962230B (en) 2018-07-27 2018-07-27 Audio recognition method based on memristor

Country Status (1)

Country Link
CN (1) CN108962230B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110378193B (en) * 2019-05-06 2022-09-06 南京邮电大学 Cashmere and wool recognition method based on memristor neural network

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107408111A (en) * 2015-11-25 2017-11-28 百度(美国)有限责任公司 End-to-end speech recognition
CN107993651A (en) * 2017-12-29 2018-05-04 深圳和而泰数据资源与云技术有限公司 A kind of audio recognition method, device, electronic equipment and storage medium

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9619749B2 (en) * 2014-03-06 2017-04-11 Progress, Inc. Neural network and method of neural network training
WO2016187500A1 (en) * 2015-05-21 2016-11-24 Cory Merkel Method and apparatus for training memristive learning systems
CN105224986B (en) * 2015-09-29 2018-01-23 清华大学 Deep neural network system based on memory resistor
EP3564866A4 (en) * 2016-12-28 2020-03-25 Shanghai Cambricon Information Technology Co., Ltd Computation method
CN107133668A (en) * 2017-04-28 2017-09-05 北京大学 A kind of memristor neural network training method based on fuzzy Boltzmann machine
CN108009640B (en) * 2017-12-25 2020-04-28 清华大学 Training device and training method of neural network based on memristor

Legal Events

Code Title
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant