CN108962230B - Speech recognition method based on memristors

Info

Publication number: CN108962230B
Application number: CN201810843588.5A
Authority: CN
Inventors: 李正浩; 李琪; 唐永亮; 李靖禾
Original assignee: Chongqing Inple Technology Co Ltd
Current assignee: Chongqing Inple Technology Co Ltd
Priority date: 2018-07-27
Filing date: 2018-07-27
Publication date: 2019-04-23
Anticipated expiration: 2038-07-27
Also published as: CN108962230A

Abstract

The invention discloses a speech recognition method based on memristors, comprising: selecting memristor devices and building memristor neural network units; constructing a memristor speech recognition network to be trained; inputting a speech training signal into the memristor speech recognition network to be trained and returning its recognition result to an FPGA; having the FPGA calculate the deviation between the recognition result and the speech training signal and train the memristor speech recognition network to be trained, thereby obtaining the required memristor speech recognition network; and inputting a speech signal to be recognized to obtain the text content of the speech signal. The notable effects are as follows: the method reduces the impact of precision requirements, preparation difficulty and time on the hardware, and realizes a more advanced multilayer convolutional neural network in hardware; and it combines speech recognition with memristors effectively, using batched connection to reduce interference from hardware crosstalk, so that speech recognition accuracy is greatly improved.

Description

Speech recognition method based on memristors

Technical field

The present invention relates to the technical field of neural-network-based speech recognition, and specifically to a speech recognition method based on memristors.

Background art

A memristor, short for memory resistor, is a circuit element that expresses the relationship between magnetic flux and charge. It has the dimension of resistance, but unlike a resistor, its resistance is determined by the charge that has flowed through it. Measuring the resistance of a memristor therefore reveals the amount of charge that has passed through it, which gives the device its ability to remember charge. The emergence of nanoscale memristors makes non-volatile random access memory a realistic prospect. Moreover, memristor-based random access memory is expected to outperform conventional random access memory in integration density, power consumption and read/write speed. In addition, memristors are the best hardware means of implementing the synapses of an artificial neural network.

In 2009, researchers at the US National Institute of Standards and Technology (NIST) built a low-power flexible memory circuit whose impedance reflected the total current that had flowed through it. In the same year, the University of Michigan developed a chip built from memristors that could store 10^3 bits of information, opening the way to smaller, faster and cheaper chips. In May of the same year, Pascal O. Vontobel et al. showed that memristors can respond to synchronized voltage pulses in the same spike-timing-dependent plasticity (STDP) manner as the brain, which laid a foundation for building neural networks. In 2017, Alexander Serb, Johannes Bill et al. described how the resistance of memristor elements changes under LTP and LTD pulse driving, and demonstrated that memristors can implement simple neural network models. In the same year, Peng Yao et al. demonstrated that memristors can be used with a neural network model for face recognition and classification. The excellent properties of memristors leave wide room for development, and as research advances, the significance of applied memristor research is becoming increasingly evident.

Although researchers have implemented neural networks on memristor arrays, the networks realized so far are fairly basic, and the methods remain some distance from practical application. The serious noise caused by hardware crosstalk has not been well solved, the way software and hardware are connected remains largely unexplored, and the application of memristors to speech recognition has rarely been addressed.

Summary of the invention

In view of the deficiencies of the prior art, the object of the present invention is to provide a speech recognition method based on memristors that quantitatively calculates the device precision a neural network requires and builds a memristor-based speech recognition network, thereby indirectly lowering the precision required of hardware preparation and achieving accurate speech recognition.

To achieve the above object, the technical solution adopted by the present invention is as follows:

A speech recognition method based on memristors, characterized in that it comprises the following steps:

Step 1: calculate the recognition accuracy of the speech recognition network simulated under devices of different precision, then select memristor devices and build memristor neural network units according to the lowest precision standard that still gives a reasonable recognition accuracy;

Step 2: according to the speech recognition network, connect an FPGA to multiple memristor neural network units to construct the memristor speech recognition network to be trained;

Step 3: input a speech training signal into the memristor speech recognition network to be trained, and return its recognition result to the training module of the FPGA;

Step 4: the training module of the FPGA calculates, from the input recognition result, the deviation between the recognition result and the label of the speech training signal, and trains the memristor speech recognition network to be trained according to the backpropagation function and the write rule of the memristor array, obtaining the required memristor speech recognition network;

Step 5: input the speech signal to be recognized into the FPGA of the required memristor speech recognition network to obtain the text content of the speech signal.

Further, the lowest precision standard in step 1 is obtained as follows:

Step 1.1: build and train the speech recognition network on a computer;

Step 1.2: set a series of precision formats for the convolution kernel values obtained from training, and run simulation tests;

Step 1.3: draw a recognition accuracy curve from the test results, and take the lowest precision that still gives a reasonable recognition accuracy as the lowest precision standard for the device.
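The selection in steps 1.1 to 1.3 can be sketched as follows. This is a minimal illustration under stated assumptions, not the toolbox used in the invention: the function names `quantize` and `lowest_acceptable_precision`, the use of a bit width to stand in for a "precision format", and the accuracy figures are all hypothetical.

```python
def quantize(value, bits, v_min=-1.0, v_max=1.0):
    """Round a kernel value to the nearest of 2**bits evenly spaced levels.

    Using a bit width as the "precision format" is an assumption made
    for this sketch.
    """
    levels = 2 ** bits - 1
    clipped = min(max(value, v_min), v_max)
    step = (v_max - v_min) / levels
    return v_min + round((clipped - v_min) / step) * step


def lowest_acceptable_precision(accuracy_by_bits, min_accuracy):
    """Pick the lowest precision whose simulated accuracy is acceptable.

    accuracy_by_bits maps each tested precision to the recognition
    accuracy measured in simulation (step 1.2).
    """
    acceptable = [b for b, acc in accuracy_by_bits.items() if acc >= min_accuracy]
    if not acceptable:
        raise ValueError("no tested precision reaches the required accuracy")
    return min(acceptable)


# Made-up simulation results, for illustration only.
simulated = {2: 0.41, 3: 0.78, 4: 0.93, 5: 0.95, 6: 0.95}
chosen_bits = lowest_acceptable_precision(simulated, min_accuracy=0.90)
```

In this made-up data set the accuracy curve flattens above 4 bits, so 4 bits would be taken as the lowest precision standard; the real threshold depends on what recognition accuracy is considered reasonable.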

Further, the memristor speech recognition network to be trained in step 2 is obtained as follows:

Step 2.1: first define on the FPGA all arithmetic logic in the speech recognition network other than the dot-product operation;

Step 2.2: build the first convolution + pooling layer from the memristor neural network units;

Step 2.3: build the second convolution + pooling layer from the memristor neural network units;

Step 2.4: build the fully connected layer from the memristor neural network units;

Step 2.5: connect the input/output ports of the first convolution + pooling layer, the second convolution + pooling layer and the fully connected layer to the I/O ports of the FPGA, obtaining the memristor speech recognition network to be trained.

Further, the dot-product operation is performed on the memristor neural network units, and the arithmetic logic includes the activation function, batched addition of the dot-product units' outputs, feedback-controlled training, quantization of signal inputs and outputs, and training pulse voltage control.

Further, the memristor neural network unit is a memristor matrix array in an a × a arrangement, where a is an integer greater than or equal to 1.

Further, the memristor neural network unit consists of 9 memristors arranged in a 3 × 3 matrix.

Further, the first convolution + pooling layer, the second convolution + pooling layer and the fully connected layer each consist of multiple memristor neural network units arranged in a p × q matrix, where p and q are unequal integers greater than 1.

Further, the first convolution + pooling layer and the second convolution + pooling layer each consist of 234 memristor neural network units arranged in a 9 × 26 matrix; the fully connected layer consists of 400 memristor neural network units arranged in a 100 × 4 matrix.
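As a quick check of the figures above, the unit and memristor counts per layer follow directly from the stated arrangements. The dictionary keys below are hypothetical labels chosen for this sketch; only the dimensions come from the text.

```python
MEMRISTORS_PER_UNIT = 3 * 3  # each unit is a 3 x 3 memristor array

# (rows, columns) of memristor neural network units in each layer
layers = {
    "conv_pool_1": (9, 26),
    "conv_pool_2": (9, 26),
    "fully_connected": (100, 4),
}

unit_counts = {name: rows * cols for name, (rows, cols) in layers.items()}
memristor_counts = {
    name: count * MEMRISTORS_PER_UNIT for name, count in unit_counts.items()
}
total_memristors = sum(memristor_counts.values())
```

Under these arrangements each convolution + pooling layer holds 234 units (2106 memristors) and the fully connected layer holds 400 units (3600 memristors).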

Further, in step 3, after the speech training signal is input to the memristor speech recognition network to be trained, the FPGA performs MFCC extraction and quantization on the speech training signal and converts it into the input signal of each memristor neural network unit.

Further, the number of training iterations is no fewer than 60.

The notable effects of the invention are as follows. A self-made toolbox is first used to calculate and determine which memristor devices can be used. Small-scale memristor arrays then serve as basic arrays, which are arranged in a defined pattern into a large processing array to build the speech recognition network. This reduces the impact of precision requirements, preparation difficulty and time on the hardware, and realizes a more advanced multilayer convolutional neural network in hardware. It also combines speech recognition with memristors effectively, using batched connection to reduce interference from hardware crosstalk, so that speech recognition accuracy is greatly improved.

Brief description of the drawings

Fig. 1 is a flow chart of the method of the present invention;

Fig. 2 is the plotted recognition accuracy curve;

Fig. 3 is a structural schematic diagram of the memristor neural network unit;

Fig. 4 is a schematic diagram of the memristor array arrangement of a convolution + pooling layer;

Fig. 5 is a schematic diagram of the memristor array arrangement of the fully connected layer;

Fig. 6 is the recognition error rate curve of the present invention.

Detailed description of the embodiments

The specific embodiments and working principle of the present invention are described in further detail below with reference to the accompanying drawings.

As shown in Fig. 1, a speech recognition method based on memristors comprises the following specific steps:

The lowest precision standard is obtained as follows:

Step 1.1: build and train the speech recognition network on a computer;

Step 1.3: draw the recognition accuracy curve shown in Fig. 2 from the test results, and choose the lowest precision that gives an acceptable recognition rate as the minimum standard for device precision. Device precision is expressed as the magnitude of the resistance fluctuation range during continuous adjustment.

Preferably, the memristor neural network unit is a rectangular 3 × 3 memristor array: 9 memristors are arranged in one integrated chip in a matrix of 3 per row and 3 per column. That is, the 9 memristors, combined with CMOS devices, transistors and the like, are placed on a crossbar of 3 rows and 3 columns, in which the horizontal bars (rows) receive the voltage inputs V1, V2 and V3, and the vertical bars (columns) provide the current outputs U1, U2 and U3. The finished memristor neural network unit is shown in Fig. 3;
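The dot product performed by one such unit can be illustrated with an idealized crossbar model: by Ohm's law and Kirchhoff's current law, each column output is the sum of the row voltages weighted by the conductances in that column. The sketch below assumes ideal, linear devices and ignores wire resistance, the CMOS access circuitry and crosstalk; the function name `crossbar_output` and the conductance values (in arbitrary units) are hypothetical.

```python
def crossbar_output(voltages, conductances):
    """Ideal column outputs of a crossbar: U_j = sum_i V_i * G[i][j].

    voltages[i] is the input applied to row i; conductances[i][j] is the
    conductance of the memristor at row i, column j.
    """
    n_rows = len(conductances)
    n_cols = len(conductances[0])
    if len(voltages) != n_rows:
        raise ValueError("one input voltage per row is required")
    return [
        sum(voltages[i] * conductances[i][j] for i in range(n_rows))
        for j in range(n_cols)
    ]


# Arbitrary example values for a 3 x 3 unit.
G = [[1.0, 0.0, 2.0],
     [0.0, 1.0, 0.0],
     [1.0, 1.0, 1.0]]
V = [1.0, 2.0, 3.0]      # V1, V2, V3
U = crossbar_output(V, G)  # U1, U2, U3
```

With these values the three outputs are 4.0, 5.0 and 5.0, which is the vector-matrix product of V and G computed in a single read step.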

Step 2: according to the speech recognition network, connect the FPGA (field-programmable gate array) to multiple memristor neural network units to construct the memristor speech recognition network to be trained. The specific steps are:

Step 2.1: first define on the FPGA all arithmetic logic in the speech recognition network other than the dot-product operation, including the activation function part, the batched addition part for the dot-product units, the feedback-controlled training part, the signal input/output quantization part, the training pulse voltage control part, and so on.

Step 2.2: arrange 234 memristor neural network units in a 9 × 26 matrix and build the first convolution + pooling layer as shown in Fig. 4;

Step 2.3: arrange 234 memristor neural network units in a 9 × 26 matrix and build the second convolution + pooling layer as shown in Fig. 4;

Step 2.4: arrange 400 memristor neural network units in a 100 × 4 matrix and build the fully connected layer as shown in Fig. 5;

When the first convolution + pooling layer and the second convolution + pooling layer perform a convolution, the part of the input matrix covered by the convolution kernel is flattened and fed, column by column, into the nine arrays of each column. Each memristor column of each array column corresponds to one convolution kernel, and the U at the lower end of each column is the output of that column's convolution kernel. When the fully connected layer performs the fully connected operation, the last two columns are not used, so the 100 rows of arrays correspond to the 300 inputs of the fully connected layer, the 4 columns of arrays correspond to its 10 outputs, and the U at the lower end is the output of the corresponding column. Every array in the same row receives the same input.
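The fully connected mapping can be sketched as a tiled vector-matrix product: 100 rows of 3-input units give 300 inputs, 4 columns of 3-output units give 12 physical columns, and dropping the last two leaves 10 outputs. The sketch assumes ideal units and assumes that the partial outputs of the units in one column are summed digitally, in line with the batched addition logic defined on the FPGA; the function names and the all-ones conductances are hypothetical.

```python
UNIT = 3  # each memristor neural network unit is a 3 x 3 array


def unit_output(voltages, conductances):
    """Ideal column outputs of one 3 x 3 unit: U_j = sum_i V_i * G[i][j]."""
    return [
        sum(voltages[i] * conductances[i][j] for i in range(UNIT))
        for j in range(UNIT)
    ]


def fully_connected(inputs, units, n_outputs):
    """Tiled vector-matrix product over a grid of 3 x 3 units.

    units[r][c] is the conductance matrix of the unit at grid row r,
    grid column c. Every unit in grid row r receives the same 3 inputs.
    """
    n_rows, n_cols = len(units), len(units[0])
    if len(inputs) != n_rows * UNIT:
        raise ValueError("expected %d inputs" % (n_rows * UNIT))
    if n_outputs > n_cols * UNIT:
        raise ValueError("more outputs requested than physical columns")
    totals = [0.0] * (n_cols * UNIT)
    for r in range(n_rows):
        row_inputs = inputs[r * UNIT:(r + 1) * UNIT]
        for c in range(n_cols):
            partial = unit_output(row_inputs, units[r][c])
            for j in range(UNIT):
                totals[c * UNIT + j] += partial[j]  # batched addition
    return totals[:n_outputs]  # trailing physical columns are unused


ones = [[1.0] * UNIT for _ in range(UNIT)]
units = [[ones for _ in range(4)] for _ in range(100)]  # 100 x 4 grid
outputs = fully_connected([1.0] * 300, units, n_outputs=10)
```

With all inputs and conductances set to 1.0, each of the 10 outputs equals 300.0, which confirms that every input reaches every used output column.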

Step 3: input the speech training signal into the memristor speech recognition network to be trained; the FPGA performs MFCC extraction and quantization on the speech training signal, converts it into the input signal of each memristor neural network unit, and returns the recognition result to the training module of the FPGA;

During testing, the training module of the FPGA performs MFCC extraction and quantization on the signal, converts it into the input signal of the memristor modules, and feeds it to the memristor speech recognition network. In training mode, the output signal returned by the memristor speech recognition network to be trained must also be quantized again.
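The quantization stage can be illustrated with a simple uniform quantizer that maps a frame of feature values onto a small set of input voltages. MFCC extraction itself is not shown. The linear mapping, the number of levels, the voltage range and the name `quantize_to_levels` are all assumptions made for this sketch; the text does not specify the quantization scheme.

```python
def quantize_to_levels(features, n_levels, v_max):
    """Map features linearly onto n_levels evenly spaced voltages in [0, v_max]."""
    if n_levels < 2:
        raise ValueError("at least two levels are required")
    lo, hi = min(features), max(features)
    span = hi - lo
    if span == 0:
        return [0.0] * len(features)
    voltages = []
    for x in features:
        # Index of the nearest level (round half up).
        index = int((x - lo) / span * (n_levels - 1) + 0.5)
        voltages.append(index * v_max / (n_levels - 1))
    return voltages


# Made-up feature values standing in for one MFCC frame.
mfcc_frame = [-4.0, 0.0, 4.0, 2.0]
voltages = quantize_to_levels(mfcc_frame, n_levels=5, v_max=1.0)
```

For this frame the quantized voltages are 0.0, 0.5, 1.0 and 0.75. The same kind of quantizer could be applied to the outputs returned by the network in training mode.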

After the input signal is quantized, the output feature map is calculated as follows:

$$x_j^{l} = f\Big(\sum_{i \in M_j} x_i^{l-1} * k_{ij}^{l} + b_j^{l}\Big)$$

where $x_j^{l}$ is the j-th feature map of layer l, f(x) is the activation function, $M_j$ is the set of input feature maps, $k_{ij}^{l}$ is the convolution kernel of layer l linking the i-th input feature map to the j-th output feature map, and $b_j^{l}$ is the bias of the j-th output feature map at layer l.
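The feature map computation described above (a sum over the input feature maps of each map convolved with its kernel, plus a bias, passed through the activation function) can be sketched in plain Python. Two choices here are assumptions: the sliding-window product is implemented without flipping the kernel, as is common in convolutional networks, and ReLU is used as the activation because the text does not name one.

```python
def correlate_valid(image, kernel):
    """Slide the kernel over the image ("valid" positions only)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + a][c + b] * kernel[a][b]
                for a in range(kh) for b in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def relu(v):
    """Assumed activation function; the source does not specify one."""
    return v if v > 0.0 else 0.0


def output_feature_map(input_maps, kernels, bias, activation=relu):
    """One output map: f(sum_i x_i * k_ij + b_j), summed over input maps i."""
    total = None
    for x_i, k_ij in zip(input_maps, kernels):
        response = correlate_valid(x_i, k_ij)
        if total is None:
            total = response
        else:
            total = [[s + t for s, t in zip(row_s, row_t)]
                     for row_s, row_t in zip(total, response)]
    return [[activation(v + bias) for v in row] for row in total]


ones_4x4 = [[1.0] * 4 for _ in range(4)]
k_a = [[1.0] * 3 for _ in range(3)]
k_b = [[0.5] * 3 for _ in range(3)]
feature_map = output_feature_map([ones_4x4, ones_4x4], [k_a, k_b], bias=-3.5)
```

Each window contributes 9.0 from the first map and 4.5 from the second; adding the bias of -3.5 gives 10.0 at every position of the 2 × 2 output. On the hardware, each window's sum of products is what a memristor column computes after the window has been flattened into row voltages.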

Step 4: the training module of the FPGA calculates, from the input recognition result, the deviation between the recognition result and the label of the speech training signal, and trains the memristor speech recognition network to be trained according to the backpropagation function and the write rule of the memristor array, with no fewer than 60 training iterations, obtaining the required memristor speech recognition network;

In this example, the deviation is calculated by a subtracter array programmed in the FPGA; the configured backpropagation function then converts the required weight changes into the control voltages and adjustment pulses needed to adjust the memristors.
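The two operations in this paragraph can be sketched as an element-wise subtraction followed by a conversion from a weight change to a pulse count. The write rule of the memristor array is not given in the text, so the sketch assumes a simple linear device in which every pulse shifts the weight by a fixed step; real devices only approximate this. The function names and the step size are hypothetical.

```python
def deviation(outputs, labels):
    """Element-wise difference, as a subtracter array would compute it."""
    return [y - t for y, t in zip(outputs, labels)]


def weight_change_to_pulses(delta_w, step_per_pulse):
    """Convert a desired weight change into (polarity, pulse_count).

    Assumes each pulse changes the weight by step_per_pulse; polarity is
    +1 for an increase, -1 for a decrease, 0 if no pulse is needed.
    """
    if step_per_pulse <= 0:
        raise ValueError("step_per_pulse must be positive")
    count = int(abs(delta_w) / step_per_pulse + 0.5)  # nearest whole pulse
    if count == 0:
        return 0, 0
    return (1 if delta_w > 0 else -1), count


errors = deviation([0.75, 0.25], [1.0, 0.0])
increase = weight_change_to_pulses(0.5, step_per_pulse=0.125)
decrease = weight_change_to_pulses(-0.25, step_per_pulse=0.125)
too_small = weight_change_to_pulses(0.03125, step_per_pulse=0.125)
```

Here a change of +0.5 maps to four positive pulses and a change of -0.25 to two negative pulses, while a change smaller than half a step is dropped. That rounding is one way limited device precision enters training.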

In addition, this scheme evaluates the training result with the mean square error equation to determine whether the training precision is up to standard; if it is not, the trained memristor speech recognition network is trained again until the precision is up to standard;

The mean square error is calculated as:

$$E^{n} = \frac{1}{2}\sum_{k=1}^{c}\left(t_k^{n} - y_k^{n}\right)^{2}$$

where $E^{n}$ is the error of the n-th sample, $y_k^{n}$ is the k-th output for the n-th sample, $t_k^{n}$ is the k-th label of the n-th sample, k ∈ (1, c), and c is the total number of classes, which is also the number of outputs.
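The error check that decides whether another training round is needed can be sketched as follows. The sketch assumes the usual half-sum-of-squares form of the per-sample error and averages it over a batch; the threshold value and the function names are hypothetical.

```python
def squared_error(outputs, labels):
    """Per-sample error: half the sum over k of (t_k - y_k) squared."""
    return 0.5 * sum((t - y) ** 2 for y, t in zip(outputs, labels))


def mean_error(batch_outputs, batch_labels):
    """Average the per-sample error over all samples."""
    errors = [squared_error(y, t) for y, t in zip(batch_outputs, batch_labels)]
    return sum(errors) / len(errors)


def is_up_to_standard(batch_outputs, batch_labels, threshold):
    """True if the averaged error is within the chosen (assumed) threshold."""
    return mean_error(batch_outputs, batch_labels) <= threshold


# Two samples, two classes; values are made up for illustration.
batch_outputs = [[1.0, 0.0], [0.5, 0.5]]
batch_labels = [[1.0, 0.0], [1.0, 0.0]]
error = mean_error(batch_outputs, batch_labels)
needs_retraining = not is_up_to_standard(batch_outputs, batch_labels, threshold=0.05)
```

The first sample is classified perfectly and the second contributes an error of 0.25, so the averaged error is 0.125 and, under the assumed threshold of 0.05, the network would be trained again.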

Once training is complete, the required memristor speech recognition network is fixed. The speech signal to be tested is then input to the FPGA, and the recognition network computes the text of the speech signal; the output mode can be changed to suit the practical application.

In this embodiment, the speech recognition neural network obtained with 60 training iterations was used to recognize a total of 2000 test utterances representing ten different commands. The resulting recognition error rate is shown in Fig. 6. It can be seen that the invention allows devices and precision to be chosen as needed while successfully realizing, with satisfactory precision, a more complex speech recognition neural network with good recognition accuracy.

Claims

1. A speech recognition method based on memristors, characterized by comprising the following steps:

Step 1: calculate the recognition accuracy of the speech recognition network simulated under devices of different precision, then select memristor devices and build memristor neural network units according to the lowest precision standard that still gives a reasonable recognition accuracy;

Step 2: according to the speech recognition network, connect an FPGA to multiple memristor neural network units to construct the memristor speech recognition network to be trained;

Step 3: input a speech training signal into the memristor speech recognition network to be trained, and return its recognition result to the training module of the FPGA;

Step 4: the training module of the FPGA calculates, from the input recognition result, the deviation between the recognition result and the label of the speech training signal, and trains the memristor speech recognition network to be trained according to the backpropagation function and the write rule of the memristor array, obtaining the required memristor speech recognition network;

Step 5: input the speech signal to be recognized into the FPGA of the required memristor speech recognition network to obtain the text content of the speech signal;

The lowest precision standard in step 1 is obtained as follows:

Step 1.1: build and train the speech recognition network on a computer;

Step 1.3: draw a recognition accuracy curve from the test results, and take the lowest precision that still gives a reasonable recognition accuracy as the lowest precision standard for the device;

The memristor speech recognition network to be trained in step 2 is obtained as follows:

Step 2.1: first define on the FPGA all arithmetic logic in the speech recognition network other than the dot-product operation;

Step 2.5: connect the input/output ports of the first convolution + pooling layer, the second convolution + pooling layer and the fully connected layer to the I/O ports of the FPGA, obtaining the memristor speech recognition network to be trained.

2. The speech recognition method based on memristors according to claim 1, characterized in that: the dot-product operation is performed on the memristor neural network units, and the arithmetic logic includes the activation function, batched addition of the dot-product units' outputs, feedback-controlled training, quantization of signal inputs and outputs, and training pulse voltage control.

3. The speech recognition method based on memristors according to claim 1 or 2, characterized in that: the memristor neural network unit is a memristor matrix array in an a × a arrangement, where a is an integer greater than or equal to 1.

4. The speech recognition method based on memristors according to claim 1, characterized in that: the memristor neural network unit consists of 9 memristors arranged in a 3 × 3 matrix.

5. The speech recognition method based on memristors according to claim 1, characterized in that: the first convolution + pooling layer, the second convolution + pooling layer and the fully connected layer each consist of multiple memristor neural network units arranged in a p × q matrix, where p and q are unequal integers greater than 1.

6. The speech recognition method based on memristors according to claim 1 or 5, characterized in that: the first convolution + pooling layer and the second convolution + pooling layer each consist of 234 memristor neural network units arranged in a 9 × 26 matrix; the fully connected layer consists of 400 memristor neural network units arranged in a 100 × 4 matrix.

7. The speech recognition method based on memristors according to claim 1, characterized in that: in step 3, after the speech training signal is input to the memristor speech recognition network to be trained, the FPGA performs MFCC extraction and quantization on the speech training signal and converts it into the input signal of each memristor neural network unit.

8. The speech recognition method based on memristors according to claim 1, characterized in that: the number of training iterations is no fewer than 60.