CN104751227A — Method and system for constructing a deep neural network (Google Patents)
Legal status: Granted
Abstract
The invention discloses a method and system for constructing a deep neural network. The method comprises: determining the number of nodes in the input layer and in the output layer of the deep neural network; acquiring training data; determining the number of hidden layers and the number of nodes in the first hidden layer; determining the number of nodes in each subsequent hidden layer from the amount of training data, the number of hidden layers, and the number of nodes in the first hidden layer, such that the node counts of the hidden layers decrease progressively; and determining the model parameters of the deep neural network from the training data to obtain the deep neural network. Compared with prior-art deep neural networks, a network constructed with this method and system has far fewer parameters, requires less storage space, and trains faster.
Description
Technical field
The present invention relates to the field of signal processing, and in particular to a method and system for constructing a deep neural network.
Background art
Speech recognition lets machines understand human speech by converting a voice signal into input a computer can recognize. Over the past twenty years speech recognition technology has made remarkable progress and has moved from the laboratory to the market. Voice input, speech retrieval, speech translation, and other applications based on speech recognition are now widely used. With the advance of science and technology and the explosive growth of information, more and more speech data is available; how to use this massive data to train a speech recognition system that achieves a high recognition rate is a difficult problem in practice.
Traditional continuous speech recognition systems mainly adopt the GMM-HMM architecture, based on the Hidden Markov Model (HMM) and the Gaussian Mixture Model (GMM). A GMM-HMM system uses an HMM to model the temporal structure of the speech signal, and the output probability of each HMM state is modeled by a mixture of Gaussians. In recent years the DNN-HMM architecture, which combines a deep neural network (DNN) with a Hidden Markov Model, has attracted growing attention from researchers; a DNN-HMM system uses a DNN instead of a GMM to model the output probability of each HMM state. Compared with a GMM, a DNN has stronger descriptive power: it can better model very complex data distributions and can learn contextual information from the data, so a DNN-HMM system achieves a significant performance gain over a GMM-HMM system.
Although a DNN-HMM system has a clear performance advantage, it is still difficult to deploy in practice, mainly because its model complexity is much higher: both training and decoding take far longer than with a GMM-HMM system. For example, a DNN model normally has at least 6 hidden layers, and the number of nodes in every hidden layer is preset by the system to the same value, such as 2048 or 2560. Such a model has a complex topology and a large number of parameters, which puts heavy computational pressure on training over large databases and on subsequent speech decoding, makes the system run too slowly, and hinders its practical deployment and updating.
Summary of the invention
The object of the invention is to overcome these deficiencies of the prior art and provide a method and system for constructing a deep neural network that greatly reduce the redundancy of the nodes in the network by effectively controlling the number of nodes in each hidden layer.
To achieve this object, the technical scheme of the invention is as follows:
A method for constructing a deep neural network, comprising:
determining the number of nodes in the input layer and in the output layer of the deep neural network;
acquiring training data;
determining the number of hidden layers of the deep neural network and the number of nodes in the first hidden layer;
determining the number of nodes in each subsequent hidden layer from the amount of training data, the number of hidden layers, and the number of nodes in the first hidden layer, such that the node counts of the hidden layers decrease progressively;
determining the model parameters of the deep neural network from the training data to obtain the deep neural network.
Preferably, determining the number of nodes in each subsequent hidden layer so that the node counts decrease progressively comprises:
determining a decrement scheme and a decrement ratio from the amount of training data;
determining the number of nodes in each subsequent hidden layer according to the determined decrement scheme and decrement ratio.
Preferably, determining the decrement scheme and decrement ratio from the amount of training data comprises:
decreasing the number of nodes in each subsequent hidden layer, layer by layer, by the decrement ratio, starting from the number of nodes in the first hidden layer.
Preferably, determining the decrement scheme and decrement ratio from the amount of training data comprises:
for the hidden layers whose index is less than or equal to a layer-count threshold, decreasing the number of nodes in each layer, layer by layer, by the decrement ratio, starting from the number of nodes in the first hidden layer;
for the hidden layers whose index is greater than the threshold, decreasing the number of nodes in each odd-indexed hidden layer by the decrement ratio relative to the previous odd-indexed hidden layer, and setting the number of nodes in each even-indexed hidden layer equal to that of the layer before it.
Preferably, the decrement ratio of the k-th hidden layer is 1/p^(k-1), where 1 ≤ p ≤ 2.
A system for constructing a deep neural network, comprising:
an input/output layer determining unit for determining the number of nodes in the input layer and in the output layer of the deep neural network;
a data acquisition unit for acquiring training data;
a first hidden-layer determining unit for determining the number of hidden layers of the deep neural network and the number of nodes in the first hidden layer;
a second hidden-layer determining unit for determining the number of nodes in each subsequent hidden layer from the amount of training data, the number of hidden layers, and the number of nodes in the first hidden layer, such that the node counts of the hidden layers decrease progressively;
a model parameter determining unit for determining the model parameters of the deep neural network from the training data to obtain the deep neural network.
Preferably, the second hidden-layer determining unit comprises:
a decrement-scheme determining unit for determining a decrement scheme and a decrement ratio from the amount of training data;
a node-count determining unit for determining the number of nodes in each subsequent hidden layer according to the determined decrement scheme and decrement ratio.
Preferably, the decrement-scheme determining unit is specifically configured to determine, from the amount of training data, that the number of nodes in each subsequent hidden layer decreases layer by layer by the decrement ratio, starting from the number of nodes in the first hidden layer.
Preferably, the decrement-scheme determining unit is specifically configured to determine, for the hidden layers whose index is less than or equal to a layer-count threshold, that the number of nodes in each layer decreases layer by layer by the decrement ratio, starting from the number of nodes in the first hidden layer; and, for the hidden layers whose index is greater than the threshold, that the number of nodes in each odd-indexed hidden layer decreases by the decrement ratio relative to the previous odd-indexed hidden layer, with the number of nodes in each even-indexed hidden layer equal to that of the layer before it.
Preferably, the decrement ratio of the k-th hidden layer is 1/p^(k-1), where 1 ≤ p ≤ 2.
The beneficial effects of the invention are:
1. Compared with the common deep neural networks in which every hidden layer has the same number of nodes, a deep neural network built with the invention has far fewer parameters, which reduces the required storage space and speeds up model training.
2. Because fewer parameters shorten the time needed to compute state output probabilities during decoding, applying a network built with the invention to a speech recognition system improves the system's decoding speed and thus its real-time behavior in practice.
3. The large reduction in network parameters does not affect the final recognition performance of the speech recognition system; moreover, with the total parameter count held constant, recognition performance can even be improved by increasing the number of nodes in the hidden layers close to the input layer.
Brief description of the drawings
To illustrate the technical scheme of the embodiments more clearly, the drawings used in their description are briefly introduced below. Obviously, the drawings described below show only some embodiments of the invention; those of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a flowchart of the deep neural network construction method of an embodiment of the invention;
Fig. 2 is a flowchart of one method of determining the node counts of the subsequent hidden layers in an embodiment of the invention;
Fig. 3 is a flowchart of another method of determining the node counts of the subsequent hidden layers in an embodiment of the invention;
Fig. 4 is a schematic diagram of the deep neural network construction system of an embodiment of the invention.
Detailed description of the embodiments
The technical scheme of the embodiments of the invention is described below clearly and completely with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the invention. All other embodiments obtained by those of ordinary skill in the art from these embodiments without creative effort fall within the scope of protection of the invention.
To help those skilled in the art better understand the scheme of the embodiments of the invention, the training process of a traditional DNN model is first briefly described below.
The training process of a traditional DNN model comprises:
Step one: determine the topology of the DNN model.
Specifically, the input layer and the output layer of the DNN correspond respectively to the acoustic features and the output states of the HMM model, so their node counts can be determined before training. The number of hidden layers and the number of nodes in each hidden layer are usually also preset by experience; although the empirical values differ between systems, in most cases the DNN has between 4 and 9 hidden layers, and every hidden layer has the same number of nodes, typically 1024, 2048, or 2560.
Step two: train the parameters of the deep neural network model.
Specifically, the model parameters are the weights. The collected training data is used to train the weights of the DNN model, and the whole training process is divided into two steps:
A) Unsupervised pre-training
The system first generates random numbers drawn from a Gaussian distribution as the initial weights of the network, then uses only the acoustic features of the training data to train the weights layer by layer, from the input layer toward the output layer, with the restricted Boltzmann machine training procedure. Specifically, after the weights between the input layer and the first hidden layer have been trained, the acoustic features and these weights are used to compute the output values of the first hidden layer, which are then taken as the input of a restricted Boltzmann machine to train the weights between the first and second hidden layers; this is repeated until the weights between the second-to-last and the last hidden layers have been trained.
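The greedy layer-wise procedure above can be sketched in a few lines. The following is a simplified mean-field CD-1 sketch, not the patent's training recipe: biases, sampling, momentum, and mini-batching are omitted, and all sizes, learning rates, and data are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

def train_rbm(data, n_hidden, lr=0.05, epochs=5):
    """One-step contrastive divergence (CD-1) for an RBM, mean-field
    version without biases: W grows toward making the reconstruction
    v1 resemble the data v0."""
    n_visible = data.shape[1]
    W = rng.normal(0.0, 0.01, size=(n_visible, n_hidden))  # Gaussian init
    for _ in range(epochs):
        v0 = data
        h0 = sigmoid(v0 @ W)
        v1 = sigmoid(h0 @ W.T)           # reconstruction
        h1 = sigmoid(v1 @ W)
        W += lr * (v0.T @ h0 - v1.T @ h1) / len(data)
    return W

def pretrain(data, layer_sizes):
    """Greedy layer-wise pre-training: train each layer's weights, then
    feed that layer's activations forward as the next RBM's input."""
    weights, x = [], data
    for n_hidden in layer_sizes:
        W = train_rbm(x, n_hidden)
        weights.append(W)
        x = sigmoid(x @ W)               # output of this hidden layer
    return weights

feats = rng.random((32, 8))              # toy stand-in for acoustic features
ws = pretrain(feats, [6, 4, 2])
print([w.shape for w in ws])             # [(8, 6), (6, 4), (4, 2)]
```

The output shows the chained weight shapes: each RBM's visible size equals the previous hidden size, which is exactly the layer-by-layer scheme the paragraph describes.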
B) Supervised fine-tuning
The weights obtained by unsupervised pre-training are taken as the initial weights of the network, and the acoustic features of the training data together with their labels are used to fine-tune all weights with the error back-propagation algorithm. Specifically, the error value E between the output of the current network (computed with the current weights) and the true labels is calculated first; then the gradient of E with respect to each layer's weights, ∂E/∂W_i, is computed; finally each layer's weights are updated by gradient descent:

W_i^(t+1) = W_i^t − η · ∂E/∂W_i^t,

where W_i^t is the current weight of the i-th layer, W_i^(t+1) is the updated weight of the i-th layer, and η is the learning rate.
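The gradient-descent update can be illustrated numerically. This is a minimal sketch with a single linear layer and a squared-error criterion; the layer size, learning rate, and data values are invented for the example and are not from the patent.

```python
import numpy as np

def sgd_step(W, grad, lr=0.1):
    """One gradient-descent update: W^(t+1) = W^t - lr * dE/dW^t."""
    return W - lr * grad

# Toy single linear layer y = W x with squared error E = 0.5 * ||y - t||^2,
# so dE/dW = (y - t) x^T.
W = np.array([[0.5, -0.2]])   # 1x2 weight matrix (hypothetical values)
x = np.array([1.0, 2.0])      # input feature
t = np.array([1.0])           # target
y = W @ x                     # forward pass: y = 0.1
grad = np.outer(y - t, x)     # dE/dW = [[-0.9, -1.8]]
W_new = sgd_step(W, grad)
print(W_new)                  # [[ 0.59 -0.02]]
```

After one step the output moves from 0.1 to 0.55, closer to the target 1.0, which is the behavior the update rule is meant to produce.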
The defect of the traditional DNN model is that its topology is mainly set by experience, with the same node count chosen for every hidden layer. Such a DNN model is obviously large and contains many redundant parameters, which makes model training take a long time and final decoding very slow.
However, in building the topology of a deep neural network for speech recognition, the hidden layers near the input layer must preserve the acoustic feature information extracted from the speech waveform, and therefore often need more nodes to avoid losing acoustic information. The hidden layers near the output layer, by contrast, have already discarded much of the information in the original acoustic features that is useless for recognition or that interferes with it, and retain the discriminative information needed to distinguish different states; these layers can therefore be modeled with fewer nodes, reducing the parameter scale of the network without losing recognition performance and improving training efficiency. Research further confirms that as the layer index grows in a deep neural network, the weight distribution becomes increasingly sparse: most weights have an absolute value below 0.1, and many nodes in the network contribute nothing or almost nothing.
To this end, this application proposes a method and system for constructing a deep neural network that matches this regularity of deep neural network models: the node counts of the hidden layers decrease progressively. By effectively controlling the number of nodes in each hidden layer, the redundancy of the nodes in the network is greatly reduced; applying a network built with the invention to a speech recognition system effectively improves the model training efficiency and decoding speed of the deep neural network without losing final recognition performance.
As shown in Fig. 1, the flowchart of the construction method of the deep neural network of an embodiment of the invention, the method comprises the following steps:
Step 101: determine the number of nodes in the input layer and in the output layer of the deep neural network.
Specifically, the number of input-layer nodes equals the dimension of the DNN model's input acoustic feature, i.e. the dimension after splicing adjacent frames: for example, if each frame is a 43-dimensional spectral feature and 11 adjacent frames are spliced into one acoustic feature, the input layer has 11 × 43 = 473 nodes.
The number of output-layer nodes determines the discriminative power of the DNN; its nodes model the distribution of the HMM states. The simplest way is to assign one output-layer node to each state, but because the number of HMM states is very large, one node per state easily makes the DNN too big, so in practice the HMM states are usually first grouped by clustering, with each output-layer node corresponding to a subclass of states. Typically each output-layer node corresponds to one of the states obtained after decision-tree tying in the GMM-HMM: for example, if a GMM-HMM with 3000 states was used to segment and label the training data before training the DNN, the output layer of the DNN has 3000 nodes, in one-to-one correspondence with the 3000 states of that GMM-HMM.
Step 102: acquire training data.
Step 103: determine the number of hidden layers of the deep neural network and the number of nodes in the first hidden layer.
Specifically, the number of hidden layers is set according to the attributes of the acquired training data, where the attributes include: large-vocabulary, continuous, discrete, and so on. In a preferred embodiment of the invention, L = 6 hidden layers are normally used for large-vocabulary continuous training data, and L = 3 hidden layers for discrete training data.
The number of nodes in the first hidden layer is set according to the amount of training data, where the amount is measured in hours. In a preferred embodiment of the invention, N_1 = 2048 is normally used when there are more than 100 hours of training data, and N_1 = 1024 when there are fewer than 100 hours.
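The choices in step 103 can be condensed into a small helper. This is a sketch of the preferred embodiment's table only; the function name and the handling of exactly 100 hours are assumptions not specified by the patent.

```python
def hidden_layout_seed(large_vocab_continuous: bool, hours: float):
    """Return (L, N1): hidden-layer count and first-hidden-layer width.

    Preferred-embodiment values: L = 6 for large-vocabulary continuous
    data, L = 3 for discrete data; N1 = 2048 for more than 100 hours of
    training data, otherwise 1024 (the boundary case is an assumption).
    """
    L = 6 if large_vocab_continuous else 3
    N1 = 2048 if hours > 100 else 1024
    return L, N1

print(hidden_layout_seed(True, 20000))  # (6, 2048)
print(hidden_layout_seed(False, 50))    # (3, 1024)
```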
Step 104: determine the node counts of the subsequent hidden layers from the amount of training data, the number of hidden layers, and the node count of the first hidden layer, so that the node counts of the hidden layers decrease progressively.
Specifically, a decrement scheme and a decrement ratio can first be determined from the amount of training data, and the node counts of the subsequent hidden layers then determined according to them.
Step 105: determine the model parameters of the deep neural network from the training data to obtain the deep neural network.
Specifically, the model parameters here are the weights of the deep neural network; they are determined from the training data in the same way as in the prior art, which is not repeated here.
The methods of determining the node counts of the subsequent hidden layers in step 104, so that the node counts of different hidden layers decrease progressively, are described in detail below.
Fig. 2 shows one method of determining the node counts of the subsequent hidden layers in an embodiment of the invention.
In general, the larger the amount of training data, the more hidden-layer nodes need to be retained; conversely, fewer hidden-layer nodes suffice. On this basis, in this embodiment, the node counts of the subsequent hidden layers are determined so that they decrease progressively, which can specifically comprise the following steps:
Step 201: determine the decrement ratio p from the amount of training data H (in hours), where 1 ≤ p ≤ 2. When the amount of training data is small, fewer hidden-layer nodes are needed, so p is close to 2; as the amount of data grows toward 10000 hours, p drops gradually from 2 to 1; and beyond 10000 hours p stays at 1, i.e. every hidden layer has the same number of nodes and the current deep neural network is a traditional, standard feed-forward deep neural network with identical hidden-layer sizes.
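For concreteness, the mapping from data volume to decrement ratio can be sketched as below. The patent's exact formula is not reproduced in this text, so the clamped logarithmic interpolation here is purely a hypothetical realization of the described behavior (p near 2 for small corpora, falling to 1 by 10000 hours, and staying at 1 beyond that); the cutoffs are assumptions.

```python
import math

def decrement_ratio(hours: float) -> float:
    """Hypothetical p(H): 2 for small corpora, 1 at or beyond 10000 h,
    falling smoothly in between. NOT the patent's exact formula."""
    if hours <= 100:
        return 2.0
    if hours >= 10000:
        return 1.0
    return 2.0 - 0.5 * math.log10(hours / 100.0)

print(decrement_ratio(50))     # 2.0
print(decrement_ratio(1000))   # 1.5
print(decrement_ratio(20000))  # 1.0
```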
Step 202: determine the node counts of the subsequent hidden layers as N_k = N_1 / p^(k-1), so that for each subsequent hidden layer the node count N_k decreases layer by layer from the first hidden layer's node count N_1 by the decrement ratio p.
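Under this scheme the node counts follow N_k = N_1 / p^(k-1). A short sketch (rounding to the nearest integer is an assumption; the patent's worked examples use exact powers of two, where no rounding occurs):

```python
def node_counts_uniform(N1: int, L: int, p: float) -> list:
    """Fig. 2 scheme: N_k = N1 / p**(k-1) for k = 1..L."""
    return [round(N1 / p ** (k - 1)) for k in range(1, L + 1)]

print(node_counts_uniform(1024, 6, 2.0))  # [1024, 512, 256, 128, 64, 32]
print(node_counts_uniform(2048, 6, 1.0))  # six layers of 2048
```

With p = 1 the scheme degenerates to the traditional constant-width topology, matching the remark in step 201.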
Fig. 3 shows another method of determining the node counts of the subsequent hidden layers.
The method in this embodiment differentiates by the attributes and the amount of the training data: for discrete training data or a small amount of data, fewer hidden-layer nodes are needed, and the node counts can fall quickly without affecting recognition performance; conversely, for continuous training data or a large amount of data, more hidden-layer nodes are needed, and the node counts must not fall too quickly if recognition performance is to be preserved. Because the number of hidden layers is determined by the attributes of the training data, the method in this embodiment effectively differentiates by the number of hidden layers and the amount of data.
In this embodiment, when determining the node counts of the subsequent hidden layers: for the hidden layers whose index is less than or equal to a layer-count threshold, the node count of each layer decreases layer by layer from the first hidden layer's node count by the decrement ratio; for the hidden layers whose index is greater than the threshold, the node count of each odd-indexed hidden layer decreases by the decrement ratio relative to the previous odd-indexed hidden layer, and the node count of each even-indexed hidden layer equals that of the layer before it. This can specifically comprise the following steps:
Step 301: determine the decrement ratio p from the amount of training data H (in hours), where 1 ≤ p ≤ 2. As in step 201, when the amount of training data is small, fewer hidden-layer nodes are needed, so p is close to 2; as the amount of data grows toward 10000 hours, p drops gradually from 2 to 1; and beyond 10000 hours p stays at 1, i.e. every hidden layer has the same number of nodes and the network is a traditional, standard feed-forward deep neural network with identical hidden-layer sizes.
Step 302: judge whether the number of hidden layers L is less than or equal to the layer-count threshold L_0.
Step 303: if L ≤ L_0, determine the node counts of the subsequent hidden layers as N_k = N_1 / p^(k-1), so that each subsequent hidden layer's node count N_k decreases layer by layer from the first hidden layer's node count N_1 by the decrement ratio p.
Step 304: if L > L_0, determine the node counts of the subsequent hidden layers from N_(2m-1) · p^(m-1) = N_1 and N_(2m) = N_(2m-1), so that the node count N_(2m-1) of each odd-indexed hidden layer decreases by the ratio p relative to the previous odd-indexed hidden layer, and the node count N_(2m) of each even-indexed hidden layer equals that of the layer before it.
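Steps 302–304 can be sketched as a single function: below the threshold L_0 every layer shrinks by p, and above it layers shrink in odd/even pairs with N_(2m-1) = N_1 / p^(m-1) and N_(2m) = N_(2m-1). The function and argument names are invented, and rounding is an assumption.

```python
def node_counts(N1: int, L: int, p: float, L0: int = 3) -> list:
    """Fig. 3 scheme for the hidden-layer node counts."""
    if L <= L0:
        # Step 303: every layer shrinks by p, N_k = N1 / p**(k-1).
        return [round(N1 / p ** (k - 1)) for k in range(1, L + 1)]
    # Step 304: pairs of layers share a width; the odd layer governing
    # layer k is m = ceil(k / 2), and N_(2m-1) = N1 / p**(m-1).
    return [round(N1 / p ** ((k + 1) // 2 - 1)) for k in range(1, L + 1)]

print(node_counts(1024, 6, 2.0))  # [1024, 1024, 512, 512, 256, 256]
print(node_counts(1024, 3, 2.0))  # [1024, 512, 256]
```

The two printed layouts reproduce the worked examples given below for six and three hidden layers with p = 2.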
The two methods of determining the node counts of the subsequent hidden layers are illustrated below. For large-vocabulary continuous training data of more than 10000 hours, the scheme above gives L = 6 hidden layers, N_1 = 2048 first-hidden-layer nodes, and decrement ratio p = 1; the six hidden layers of the corresponding deep neural network therefore have 2048, 2048, 2048, 2048, 2048, and 2048 nodes. For large-vocabulary continuous training data of less than 100 hours, the scheme above gives L = 6, N_1 = 1024, and p = 2; since L is greater than the default layer-count threshold L_0 = 3, the method of Fig. 3 gives hidden-layer node counts of 1024, 1024, 512, 512, 256, and 256. For discrete training data of less than 100 hours, the scheme above gives L = 3, N_1 = 1024, and p = 2; since L equals the default threshold L_0 = 3, the method of Fig. 3 gives hidden-layer node counts of 1024, 512, and 256. Compared with the common neural networks in which every hidden layer has 2048 nodes, the invention greatly reduces the number of network parameters without losing the recognition performance of a deep neural network applied to speech recognition. It is worth mentioning that setting the first hidden layer's node count to N_1 = 3072, while keeping the total parameter scale of the network the same, can further improve final recognition performance.
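The parameter saving can be checked with a quick weight count (biases ignored). Using the illustrative sizes from the description — 473 input nodes and 3000 tied-state output nodes, both taken as examples rather than requirements — the tapered 1024/1024/512/512/256/256 topology carries roughly 12% of the weights of a flat six-by-2048 network:

```python
def weight_count(layers):
    """Weights of a fully connected feed-forward net, ignoring biases:
    the sum of products of adjacent layer widths."""
    return sum(a * b for a, b in zip(layers, layers[1:]))

flat = weight_count([473] + [2048] * 6 + [3000])
tapered = weight_count([473, 1024, 1024, 512, 512, 256, 256, 3000])
print(flat, tapered)               # 28084224 3283968
print(round(tapered / flat, 3))    # 0.117
```

The last factor also reflects the point about the output layer: with 3000 (or more) tied states, shrinking the final hidden layer from 2048 to 256 removes a large share of the output-layer weights by itself.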
In summary, compared with the common deep neural networks in which every hidden layer has the same number of nodes, a deep neural network built with the invention has far fewer parameters, which reduces the required storage space and speeds up model training. This is especially true for the tied-state DNN-HMM systems currently used for large-vocabulary speech recognition, whose output layers can reach 10000 nodes or more: reducing the node count of the last hidden layer effectively reduces the number of network parameters. In addition, because fewer parameters shorten the time needed to compute state output probabilities during decoding, applying a network built with the invention to a speech recognition system improves the final decoding speed and thus the real-time behavior in practice. Furthermore, the large reduction in parameters achieved by the method of the invention does not affect the system's final recognition performance, and with the parameter count held constant, recognition performance can even be improved by increasing the node counts of the hidden layers near the input layer. Correspondingly, an embodiment of the invention also provides a system for constructing a deep neural network; Fig. 4 is a schematic diagram of its structure.
In this embodiment, the system for constructing a deep neural network comprises:
an input/output layer determining unit 401 for determining the number of nodes in the input layer and in the output layer of the deep neural network;
a data acquisition unit 402 for acquiring training data;
a first hidden-layer determining unit 403 for determining the number of hidden layers of the deep neural network and the number of nodes in the first hidden layer;
a second hidden-layer determining unit 404 for determining the node counts of the subsequent hidden layers from the amount of training data, the number of hidden layers, and the node count of the first hidden layer, so that the node counts of the hidden layers decrease progressively;
a model parameter determining unit 405 for determining the model parameters of the deep neural network from the training data to obtain the deep neural network.
In an embodiment of the present invention, a specific structure of the second hidden layer determining unit may comprise a decreasing-fashion determining unit and a node number determining unit, wherein:
the decreasing-fashion determining unit is configured to determine a decreasing fashion and a decreasing ratio according to the data volume of the training data; and
the node number determining unit is configured to determine the node numbers of the subsequent hidden layers according to the determined decreasing fashion and decreasing ratio.
In practical applications, the decreasing-fashion determining unit may determine the decreasing fashion and decreasing ratio of the hidden layer nodes in various ways according to the data volume of the training data. For example, it may determine that the node number of each subsequent hidden layer decreases layer by layer, according to the decreasing ratio, from the node number of the first hidden layer. As another example, for each hidden layer whose layer index is less than or equal to a layer number threshold, it may determine that the node number decreases layer by layer, according to the decreasing ratio, from the node number of the first hidden layer; for each hidden layer whose layer index is greater than the threshold, it may determine that the node number of each odd-indexed hidden layer decreases, according to the decreasing ratio, from the node number of the previous odd-indexed hidden layer, while the node number of each even-indexed hidden layer equals the node number of its previous hidden layer.
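The two decreasing fashions above can be sketched as follows; the function names are illustrative, and the 1-based layer index and `threshold` parameter mirror the layer number threshold described in the text:

```python
def decreasing_sizes(first, num_layers, p=2):
    """Fashion 1: every subsequent hidden layer shrinks layer by layer
    by the decreasing ratio, starting from the first hidden layer."""
    return [max(1, first // p ** k) for k in range(num_layers)]

def thresholded_sizes(first, num_layers, threshold, p=2):
    """Fashion 2: layers up to `threshold` shrink layer by layer; past
    the threshold, odd-indexed layers shrink relative to the previous
    odd-indexed layer and even-indexed layers repeat the previous layer."""
    sizes = []
    last_odd = first
    for k in range(1, num_layers + 1):      # 1-based layer index
        if k <= threshold:
            v = max(1, first // p ** (k - 1))
        elif k % 2 == 1:                    # odd layer past threshold: shrink
            v = max(1, last_odd // p)
        else:                               # even layer past threshold: repeat
            v = sizes[-1]
        if k % 2 == 1:
            last_odd = v
        sizes.append(v)
    return sizes
```

For example, with a first hidden layer of 1024 nodes, six hidden layers, a threshold of 2 and p = 2, the second fashion yields 1024, 512, 512, 512, 256, 256.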
Correspondingly, the node number determining unit can determine the node numbers of the subsequent hidden layers according to the decreasing fashion and decreasing ratio determined by the decreasing-fashion determining unit, thereby greatly reducing the number of neural network parameters, reducing the required storage space and accelerating model training.
Compared with current deep neural networks in which every hidden layer has the same node number, a deep neural network constructed by the system of this embodiment of the present invention greatly reduces the number of network parameters, thereby reducing the required storage space and accelerating model training. This is especially true for the tied-state DNN-HMM systems used by current large-vocabulary speech recognition systems: because the output layer can have 10,000 or even more nodes, reducing the node number of the last hidden layer effectively reduces the number of network parameters. In addition, because fewer parameters shorten the time needed to compute state output probabilities during decoding, applying a deep neural network built by the present invention to a speech recognition system improves the decoding speed of the final recognition and thus yields better real-time performance in practice. Furthermore, the method of the present invention does not affect the final recognition performance of the speech recognition system even when the number of parameters is greatly reduced, and, with the number of parameters unchanged, recognition performance can even be improved by increasing the node numbers of the hidden layers near the input layer.
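The parameter savings claimed above can be illustrated with a rough weight count (bias terms ignored). The input layer size of 1000 below is purely an illustrative assumption, since the text does not specify it; the hidden layer sizes 1024/512/256 versus a uniform 2048 and the 10,000-node output layer come from the description:

```python
def weight_count(layer_sizes):
    """Total weights of a fully connected network given its layer
    sizes (input, hidden layers..., output); bias terms ignored."""
    return sum(a * b for a, b in zip(layer_sizes, layer_sizes[1:]))

# 1000 input nodes is an assumed, illustrative value.
uniform = weight_count([1000, 2048, 2048, 2048, 10000])   # 30,916,608
decreasing = weight_count([1000, 1024, 512, 256, 10000])  # 4,239,360
print(uniform / decreasing)  # roughly 7x fewer parameters
```

Note how the 2048-to-10,000 connection alone dominates the uniform network, which is why shrinking the last hidden layer is so effective.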
The embodiments in this specification are described in a progressive manner; identical or similar parts among the embodiments may refer to one another, and each embodiment focuses on its differences from the others. In particular, since the system embodiment is substantially similar to the method embodiment, its description is relatively brief, and the relevant parts may refer to the description of the method embodiment. The system embodiments described above are merely illustrative, and the units and modules described as separate components may or may not be physically separate. Some or all of the units and modules may be selected according to actual needs to achieve the objectives of the embodiments. Those of ordinary skill in the art can understand and implement them without creative effort.
The structure, features and effects of the present invention have been described in detail above according to the embodiments shown in the drawings. The foregoing are merely preferred embodiments of the present invention, and the practical scope of the invention is not limited to what is shown in the drawings. Any change made according to the conception of the present invention, or any equivalent embodiment modified into an equivalent variation, that does not go beyond the spirit covered by the specification and drawings shall fall within the protection scope of the present invention.
Claims (10)
1. A method for constructing a deep neural network, characterized by comprising:
determining the node number of the input layer and the node number of the output layer of the deep neural network;
obtaining training data;
determining the number of hidden layers of the deep neural network and the node number of the first hidden layer;
determining the node numbers of the subsequent hidden layers according to the data volume of the training data, the number of hidden layers and the node number of the first hidden layer, such that the node numbers of the hidden layers decrease progressively; and
determining the model parameters of the deep neural network using the training data to obtain the deep neural network.
2. The method for constructing a deep neural network according to claim 1, characterized in that determining the node numbers of the subsequent hidden layers such that the node numbers of the hidden layers decrease progressively comprises:
determining a decreasing fashion and a decreasing ratio according to the data volume of the training data; and
determining the node numbers of the subsequent hidden layers according to the determined decreasing fashion and decreasing ratio.
3. The method for constructing a deep neural network according to claim 2, characterized in that determining the decreasing fashion and the decreasing ratio according to the data volume of the training data comprises:
determining that the node number of each subsequent hidden layer decreases layer by layer, according to the decreasing ratio, from the node number of the first hidden layer.
4. The method for constructing a deep neural network according to claim 2, characterized in that determining the decreasing fashion and the decreasing ratio according to the data volume of the training data comprises:
for each hidden layer whose layer index is less than or equal to a layer number threshold, determining that its node number decreases layer by layer, according to the decreasing ratio, from the node number of the first hidden layer; and
for each hidden layer whose layer index is greater than the layer number threshold, determining that the node number of each odd-indexed hidden layer decreases, according to the decreasing ratio, from the node number of the previous odd-indexed hidden layer, and that the node number of each even-indexed hidden layer equals the node number of its previous hidden layer.
5. The method for constructing a deep neural network according to any one of claims 2 to 4, characterized in that the decreasing ratio of the k-th hidden layer is 1/p^(k-1), wherein 1 ≤ p ≤ 2.
6. A system for constructing a deep neural network, characterized by comprising:
an input/output layer determining unit, configured to determine the node number of the input layer and the node number of the output layer of the deep neural network;
a data acquisition unit, configured to obtain training data;
a first hidden layer determining unit, configured to determine the number of hidden layers of the deep neural network and the node number of the first hidden layer;
a second hidden layer determining unit, configured to determine the node numbers of the subsequent hidden layers according to the data volume of the training data, the number of hidden layers and the node number of the first hidden layer, such that the node numbers of the hidden layers decrease progressively; and
a model parameter determining unit, configured to determine the model parameters of the deep neural network using the training data to obtain the deep neural network.
7. The system for constructing a deep neural network according to claim 6, characterized in that the second hidden layer determining unit comprises:
a decreasing-fashion determining unit, configured to determine a decreasing fashion and a decreasing ratio according to the data volume of the training data; and
a node number determining unit, configured to determine the node numbers of the subsequent hidden layers according to the determined decreasing fashion and decreasing ratio.
8. The system for constructing a deep neural network according to claim 7, characterized in that the decreasing-fashion determining unit is specifically configured to determine, according to the data volume of the training data, that the node number of each subsequent hidden layer decreases layer by layer, according to the decreasing ratio, from the node number of the first hidden layer.
9. The system for constructing a deep neural network according to claim 7, characterized in that the decreasing-fashion determining unit is specifically configured to: for each hidden layer whose layer index is less than or equal to a layer number threshold, determine that its node number decreases layer by layer, according to the decreasing ratio, from the node number of the first hidden layer; and for each hidden layer whose layer index is greater than the layer number threshold, determine that the node number of each odd-indexed hidden layer decreases, according to the decreasing ratio, from the node number of the previous odd-indexed hidden layer, and that the node number of each even-indexed hidden layer equals the node number of its previous hidden layer.
10. The system for constructing a deep neural network according to any one of claims 7 to 9, characterized in that the decreasing ratio of the k-th hidden layer is 1/p^(k-1), wherein 1 ≤ p ≤ 2.
Priority Applications (1)
Application Number  Priority Date  Filing Date  Title 

CN201310755400.9A CN104751227B (en)  20131231  20131231  Construction method and system for the deep neural network of speech recognition 
Publications (2)
Publication Number  Publication Date 

CN104751227A true CN104751227A (en)  20150701 
CN104751227B CN104751227B (en)  20180306 
Family
ID=53590872
Family Applications (1)
Application Number  Title  Priority Date  Filing Date 

CN201310755400.9A Active CN104751227B (en)  20131231  20131231  Construction method and system for the deep neural network of speech recognition 
Country Status (1)
Country  Link 

CN (1)  CN104751227B (en) 
Citations (6)
Publication number  Priority date  Publication date  Assignee  Title 

US20040076944A1 (en) *  20020822  20040422  Ibex Process Technology, Inc.  Supervised learning in the presence of null data 
US7756646B2 (en) *  20060331  20100713  Battelle Memorial Institute  Method for predicting peptide detection in mass spectrometry 
CN102411931A (en) *  20100915  20120411  微软公司  Deep belief network for large vocabulary continuous speech recognition 
CN102496059A (en) *  20111125  20120613  中冶集团武汉勘察研究院有限公司  Mine shaft well engineering surrounding rock artificial intelligence stage division method 
CN103049792A (en) *  20111126  20130417  微软公司  Discriminative pretraining of Deep Neural Network 
CN103400577A (en) *  20130801  20131120  百度在线网络技术（北京）有限公司  Acoustic model building method and device for multilanguage voice identification 

Non-Patent Citations (2)
Title

Liu Weiqun et al.: "Research on the Optimization of Hidden Layer Nodes in BP Networks" (《BP网络中隐含层节点优化的研究》), 《交通与计算机》 *
Gao Dawen et al.: "Optimization of Hidden Layer Nodes and Training Times in Artificial Neural Networks" (《人工神经网络中隐含层节点与训练次数的优化》), 《哈尔滨工业大学学报》 *
Cited By (16)
Publication number  Priority date  Publication date  Assignee  Title 

WO2017076211A1 (en) *  20151105  20170511  阿里巴巴集团控股有限公司  Voicebased role separation method and device 
CN109154798A (en) * 20160509  20190104  1Qb信息技术公司  Method and system for improving strategies for stochastic control problems 
CN109154798B (en) *  20160509  20220225  1Qb信息技术公司  Method and system for improving strategies for stochastic control problems 
WO2017206936A1 (en) *  20160602  20171207  腾讯科技（深圳）有限公司  Machine learning based network model construction method and apparatus 
CN106096727A (en) * 20160602  20161109  腾讯科技（深圳）有限公司  Network model construction method and device based on machine learning 
CN106096727B (en) * 20160602  20181207  腾讯科技（深圳）有限公司  Network model construction method and device based on machine learning 
CN108122035A (en) * 20161129  20180605  科大讯飞股份有限公司  End-to-end modeling method and system 
CN108122035B (en) * 20161129  20191018  科大讯飞股份有限公司  End-to-end modeling method and system 
CN106898354B (en) *  20170303  20200519  北京华控智加科技有限公司  Method for estimating number of speakers based on DNN model and support vector machine model 
CN106898354A (en) *  20170303  20170627  清华大学  Speaker number estimation method based on DNN models and supporting vector machine model 
CN109697506A (en) *  20171020  20190430  图核有限公司  Processing in neural network 
CN108648769A (en) *  20180420  20181012  百度在线网络技术（北京）有限公司  Voice activity detection method, apparatus and equipment 
CN109034372A (en) * 20180628  20181218  浙江大学  Neural network pruning method based on probability 
CN109034372B (en) *  20180628  20201016  浙江大学  Neural network pruning method based on probability 
CN109102067A (en) * 20180713  20181228  厦门快商通信息技术有限公司  Method for increasing and decreasing neural network nodes, computer device and storage medium 
CN109102067B (en) *  20180713  20210806  厦门快商通信息技术有限公司  Method for increasing and decreasing neural network nodes, computer device and storage medium 
Also Published As
Publication number  Publication date 

CN104751227B (en)  20180306 
Legal Events
Date  Code  Title  Description 

C06  Publication  
PB01  Publication  
C10  Entry into substantive examination  
SE01  Entry into force of request for substantive examination  
CB02  Change of applicant information 
Address after: No. 666 Wangjiang Road, High-tech Development Zone, Hefei, Anhui Province, 230088
Applicant after: iFLYTEK Co., Ltd.
Address before: No. 666 Wangjiang Road, High-tech Development Zone, Hefei, Anhui Province, 230088
Applicant before: Anhui USTC iFLYTEK Co., Ltd.

COR  Change of bibliographic data  
GR01  Patent grant 