CN104751227A - Method and system for constructing deep neural network - Google Patents
- Publication number: CN104751227A (application CN201310755400.9A)
- Authority: CN (China)
- Legal status: Granted (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
- Landscapes: Image Analysis (AREA)
Abstract
The invention discloses a method and system for constructing a deep neural network. The method comprises: determining the numbers of nodes in the input layer and the output layer of the deep neural network; acquiring training data; determining the number of hidden layers and the number of nodes in the first hidden layer; determining the numbers of nodes in the subsequent hidden layers according to the data amount of the training data, the number of hidden layers, and the number of nodes in the first hidden layer, such that the node counts of successive hidden layers decrease; and determining the model parameters of the deep neural network from the training data to obtain the deep neural network. Compared with prior-art deep neural networks, a network constructed with this method and system has far fewer parameters, requires less storage space, and trains faster.
Description
Technical field
The present invention relates to the field of signal processing, and in particular to a method and system for constructing a deep neural network.
Background art
Speech recognition lets machines understand human speech by converting a voice signal into input a computer can recognize. Over the past twenty years, speech recognition technology has made remarkable progress and has moved from the laboratory to the market. Applications based on it, such as voice input, voice search, and speech translation, are now in wide use. With advances in science and technology and the explosive growth of information, ever more speech data is available; how to use such massive data to train a speech recognition system that achieves a high recognition rate is a difficult problem in practical applications.
Traditional automatic continuous speech recognition systems mainly adopt GMM-HMM architectures based on the Hidden Markov Model (HMM) and the Gaussian Mixture Model (GMM). A GMM-HMM system uses the HMM to model the temporal structure of the speech signal, while the output probability of each HMM state is modeled by a Gaussian mixture. In recent years, DNN-HMM systems based on Deep Neural Networks (DNN) and Hidden Markov Models have received growing attention from researchers; a DNN-HMM system uses a DNN instead of a GMM to model the output probability of each HMM state. Compared with a GMM, a DNN has stronger descriptive power, can better model very complex data distributions, and can learn contextual information from the data well, so a DNN-HMM system can achieve a significant performance gain over a GMM-HMM system.
Although the DNN-HMM system has a clear performance advantage, it remains difficult to deploy in practical applications, mainly because its model complexity is high: both training and decoding take far longer than for a GMM-HMM system. For example, a DNN model normally has at least 6 hidden layers, and the node count of every hidden layer is preset by the system to the same value, such as 2048 or 2560 nodes. The topology of such a model is complex and its parameters numerous, which imposes a heavy computational burden on training over large databases and on subsequent speech decoding, makes the system run too slowly, and hinders its practical deployment and updating.
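The scale described above can be sketched with a quick weight count; the input dimension 473 and the 3000 output states are illustrative figures taken from the examples later in this description, and biases are ignored for this rough estimate:

```python
def dnn_param_count(layer_sizes):
    """Weight count of a fully connected feed-forward net.

    layer_sizes lists every layer from input to output; bias terms
    are ignored for this rough estimate.
    """
    return sum(a * b for a, b in zip(layer_sizes, layer_sizes[1:]))

# Conventional topology criticized above: six 2048-node hidden layers.
conventional = [473] + [2048] * 6 + [3000]
print(dnn_param_count(conventional))  # 28084224, i.e. ~28 million weights
```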
Summary of the invention
The object of the present invention is to overcome the above deficiencies of the prior art by providing a method and system for constructing a deep neural network that greatly reduce node redundancy in the network through effective control of the node count of each hidden layer.
To achieve the above object, the technical solution of the present invention is as follows:
A method for constructing a deep neural network, comprising:
determining the number of nodes in the input layer and the number of nodes in the output layer of the deep neural network;
acquiring training data;
determining the number of hidden layers of the deep neural network and the number of nodes in the first hidden layer;
determining the numbers of nodes in the subsequent hidden layers according to the data amount of the training data, the number of hidden layers, and the number of nodes in the first hidden layer, such that the node counts of the hidden layers decrease;
determining the model parameters of the deep neural network from the training data to obtain the deep neural network.
Preferably, determining the numbers of nodes in the subsequent hidden layers such that the node counts of the hidden layers decrease comprises:
determining a decrement mode and a decrement ratio according to the data amount of the training data;
determining the numbers of nodes in the subsequent hidden layers according to the determined decrement mode and decrement ratio.
Preferably, determining the decrement mode and decrement ratio according to the data amount of the training data comprises:
decrementing the node count of each subsequent hidden layer successively by the decrement ratio, starting from the node count of the first hidden layer.
Preferably, determining the decrement mode and decrement ratio according to the data amount of the training data comprises:
when the number of hidden layers is less than or equal to a layer-count threshold, decrementing the node count of each hidden layer successively by the decrement ratio, starting from the node count of the first hidden layer;
when the number of hidden layers is greater than the layer-count threshold, decrementing the node count of each odd-numbered hidden layer by the decrement ratio relative to the preceding odd-numbered hidden layer, and setting the node count of each even-numbered hidden layer equal to that of the hidden layer immediately before it.
Preferably, the decrement ratio of the k-th hidden layer is 1/p^(k-1), where 1 ≤ p ≤ 2.
A system for constructing a deep neural network, comprising:
an input/output layer determining unit, configured to determine the number of nodes in the input layer and the number of nodes in the output layer of the deep neural network;
a data acquisition unit, configured to acquire training data;
a first hidden-layer determining unit, configured to determine the number of hidden layers of the deep neural network and the number of nodes in the first hidden layer;
a second hidden-layer determining unit, configured to determine the numbers of nodes in the subsequent hidden layers according to the data amount of the training data, the number of hidden layers, and the number of nodes in the first hidden layer, such that the node counts of the hidden layers decrease;
a model-parameter determining unit, configured to determine the model parameters of the deep neural network from the training data to obtain the deep neural network.
Preferably, the second hidden-layer determining unit comprises:
a decrement-mode determining unit, configured to determine a decrement mode and a decrement ratio according to the data amount of the training data;
a node-count determining unit, configured to determine the numbers of nodes in the subsequent hidden layers according to the determined decrement mode and decrement ratio.
Preferably, the decrement-mode determining unit is specifically configured to determine, according to the data amount of the training data, that the node count of each subsequent hidden layer decreases successively by the decrement ratio, starting from the node count of the first hidden layer.
Preferably, the decrement-mode determining unit is specifically configured to: when the number of hidden layers is less than or equal to a layer-count threshold, determine that the node count of each hidden layer decreases successively by the decrement ratio, starting from the node count of the first hidden layer; and when the number of hidden layers is greater than the layer-count threshold, determine that the node count of each odd-numbered hidden layer decreases by the decrement ratio relative to the preceding odd-numbered hidden layer, with the node count of each even-numbered hidden layer equal to that of the hidden layer immediately before it.
Preferably, the decrement ratio of the k-th hidden layer is 1/p^(k-1), where 1 ≤ p ≤ 2.
The beneficial effects of the present invention are:
1. Compared with the currently common deep neural networks whose hidden layers all have the same node count, a deep neural network built with the present invention has far fewer parameters, requiring less storage space and speeding up model training.
2. Because fewer parameters reduce the time needed to compute state output probabilities during decoding, applying a network built with the present invention to a speech recognition system improves final decoding speed and therefore real-time performance in practice.
3. The substantial reduction in parameters does not affect the final recognition performance of the speech recognition system; moreover, with the total parameter count held constant, recognition performance can even be improved by increasing the node counts of the hidden layers near the input layer.
Brief description of the drawings
To illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings described below are obviously only some embodiments of the present invention; those of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a flowchart of the method for constructing a deep neural network according to an embodiment of the present invention;
Fig. 2 is a flowchart of one method of determining the node counts of the subsequent hidden layers in an embodiment of the present invention;
Fig. 3 is a flowchart of another method of determining the node counts of the subsequent hidden layers in an embodiment of the present invention;
Fig. 4 is a schematic structural diagram of the system for constructing a deep neural network according to an embodiment of the present invention.
Detailed description of embodiments
The technical solutions in the embodiments of the present invention are described below clearly and completely with reference to the accompanying drawings. The described embodiments are obviously only some, not all, of the embodiments of the present invention. All other embodiments obtained by those of ordinary skill in the art from these embodiments without creative effort fall within the protection scope of the present invention.
To help those skilled in the art better understand the solutions of the embodiments of the present invention, the training process of a traditional DNN model is first briefly described.
The training process of a traditional DNN model comprises:
Step 1: determine the topology of the DNN model.
Specifically, the input layer and the output layer of the DNN correspond to the acoustic features and the output states of the HMM model respectively, so their node counts can be fixed before training. The number of hidden layers and the node count of each hidden layer are usually also preset from experience; although the empirical values differ across systems, in most cases the number of hidden layers is set between 4 and 9, and every hidden layer is given the same node count, typically 1024, 2048, or 2560.
Step 2: train the parameters of the deep neural network model.
Specifically, the model parameters are the weights. The weights of the DNN model are trained on the collected training data, and the whole training process is divided into two stages:
a) Unsupervised pre-training.
The system first generates Gaussian-distributed random numbers as the initial weights of the neural network, and then uses only the acoustic features of the training data to train the weights layer by layer, from the input layer toward the output layer, with the restricted Boltzmann machine training method. Specifically, after the weights between the input layer and the first hidden layer have been trained, the acoustic features and these weights are used to compute the output values of the first hidden layer, which are treated as the input of the next restricted Boltzmann machine to train the weights between the first and second hidden layers; this is repeated until the weights between the penultimate and the last hidden layer have been trained.
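The greedy layer-wise procedure above can be sketched as follows. This is a minimal numpy illustration using one-step contrastive divergence (CD-1), a common training rule for restricted Boltzmann machines; biases, momentum, and stochastic sampling are omitted, so it is a simplified sketch rather than the exact procedure used in production systems:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train_rbm(data, n_hidden, epochs=5, lr=0.01, rng=None):
    """Simplified CD-1 training of one RBM layer (no biases/sampling)."""
    if rng is None:
        rng = np.random.default_rng(0)
    n_visible = data.shape[1]
    # Gaussian-distributed random initial weights, as in the text.
    w = rng.normal(0.0, 0.01, size=(n_visible, n_hidden))
    for _ in range(epochs):
        h0 = sigmoid(data @ w)    # up pass
        v1 = sigmoid(h0 @ w.T)    # reconstruction
        h1 = sigmoid(v1 @ w)      # up pass on the reconstruction
        w += lr * (data.T @ h0 - v1.T @ h1) / len(data)
    return w

def pretrain(features, hidden_sizes):
    """Greedy layer-wise pre-training: each trained layer's output
    becomes the 'visible' data for the next layer, as described above."""
    weights, layer_input = [], features
    for n_hidden in hidden_sizes:
        w = train_rbm(layer_input, n_hidden)
        weights.append(w)
        layer_input = sigmoid(layer_input @ w)  # output of this hidden layer
    return weights

acoustic = np.random.default_rng(1).random((32, 8))  # toy stand-in features
ws = pretrain(acoustic, [6, 4])
print([w.shape for w in ws])  # [(8, 6), (6, 4)]
```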
b) Supervised final training.
The weights obtained by unsupervised pre-training serve as the initial weights of the neural network; the acoustic features of the training data and the corresponding labels are then used to perform a final optimization of all weights with the error backpropagation algorithm. Specifically, with the current weights, the error value E between the output of the current neural network and the true result (the label) is first computed; the gradient of E with respect to each layer's weights is then computed, and finally each layer's weights are updated by gradient descent:

W_i^(t+1) = W_i^t - η · ∂E/∂W_i^t

where W_i^t is the current weight of the i-th layer, W_i^(t+1) is the weight of the i-th layer after the update, and η is the learning rate of the gradient-descent step.
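As a minimal illustration of the gradient-descent weight update described above, the sketch below applies it to a single layer with a squared-error E; the learning rate and the toy data are assumptions for the example, not values from this patent:

```python
import numpy as np

# Update rule W_i(t+1) = W_i(t) - eta * dE/dW_i for one layer with
# squared error E = 0.5 * ||x @ w - y||^2, whose gradient is
# dE/dw = x.T @ (x @ w - y).
def gradient_step(w, x, y, eta=0.1):
    grad = x.T @ (x @ w - y)   # dE/dw
    return w - eta * grad      # gradient-descent update

x = np.array([[1.0, 0.0], [0.0, 1.0]])
y = np.array([[1.0], [2.0]])
w = np.zeros((2, 1))
for _ in range(50):
    w = gradient_step(w, x, y)
print(np.round(w.ravel(), 3))  # converges toward [1. 2.]
```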
The defect of the traditional DNN model is that its topology is set mainly from experience, with the same node count chosen for every hidden layer. Such a DNN model is large and contains many redundant parameters, so model training takes a long time and final decoding is very slow.
However, in the topology of a deep neural network used for speech recognition, the hidden layers near the input layer need to preserve the acoustic feature information extracted from the speech waveform, and therefore often need to retain more nodes to avoid losing acoustic information. The hidden layers near the output layer, by contrast, have discarded much of the information in the original acoustic features that is useless or even harmful for recognition, retaining mainly the discriminative information that distinguishes the different states; these layers can therefore be modeled with fewer nodes, reducing the network parameter scale and improving training efficiency without losing recognition performance. Research results further confirm that as the layer index of a deep neural network increases, its weight distribution becomes gradually sparse: most weight absolute values fall below 0.1, and many nodes in the network have no effect or a negligible one.
Accordingly, the present application proposes a method and system for constructing a deep neural network that match this regularity of deep neural network models by making the node counts of the hidden layers decrease. Through effective control of the node count of each hidden layer, node redundancy in the network is greatly reduced; when a network built with the present invention is applied to a speech recognition system, the training efficiency and decoding speed of the deep neural network can be improved effectively without losing final recognition performance.
As shown in Fig. 1, a flowchart of the method for constructing a deep neural network according to an embodiment of the present invention, the method comprises the following steps:
Step 101: determine the number of nodes in the input layer and the number of nodes in the output layer of the deep neural network.
Specifically, the node count of the input layer is the dimensionality of the acoustic feature input to the DNN model, i.e. the dimensionality after adjacent frames are spliced. For example, if the acoustic feature is formed by splicing 11 adjacent frames and each frame is a 43-dimensional spectral feature before splicing, the input layer has 11 × 43 = 473 nodes.
The node count of the output layer determines the discriminative power of the DNN; its nodes model the distribution of the HMM states. The simplest way is to assign one output node per state, but because an HMM model has too many states, a one-node-per-state setup easily makes the DNN too large. In practice, therefore, the HMM states are usually first clustered, and each output node corresponds to a subset of states. Typically each output node corresponds to one of the states tied by the decision tree in the GMM-HMM system. For example, if a GMM-HMM with 3000 tied states was used to segment the training-data labels before the DNN is trained, the DNN output layer has 3000 nodes, in one-to-one correspondence with the 3000 states of that GMM-HMM.
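The sizing of the input and output layers reduces to simple arithmetic; the sketch below reuses the figures from the examples above:

```python
def input_layer_nodes(context_frames, feature_dim):
    # Input-layer size = dimensionality of the spliced acoustic feature.
    return context_frames * feature_dim

# 11 adjacent frames of 43-dimensional spectral features, as above:
print(input_layer_nodes(11, 43))  # 473

# The output layer gets one node per tied HMM state; 3000 is the
# example figure used in the description above.
output_layer_nodes = 3000
```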
Step 102: acquire training data.
Step 103: determine the number of hidden layers of the deep neural network and the number of nodes in the first hidden layer.
Specifically, the number of hidden layers of the deep neural network is set according to the attributes of the acquired training data, which include, among others: large-vocabulary, continuous, and discrete. In a preferred embodiment of the invention, the number of hidden layers is usually set to L = 6 for large-vocabulary continuous training data and to L = 3 for discrete training data.

The node count of the first hidden layer of the deep neural network is set according to the data amount of the acquired training data, where the data amount refers to the number of hours of training data. In a preferred embodiment of the invention, the first hidden layer is usually given N_1 = 2048 nodes when the data amount exceeds 100 hours, and N_1 = 1024 nodes when the data amount is below 100 hours.
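The two rules of step 103 can be sketched directly; the behavior at exactly 100 hours is not specified in the description, so the boundary used below is an assumption:

```python
def hidden_layer_count(continuous_large_vocab):
    # L = 6 for large-vocabulary continuous data, L = 3 for discrete
    # data, per the preferred embodiment above.
    return 6 if continuous_large_vocab else 3

def first_hidden_nodes(hours):
    # N_1 = 2048 above 100 hours of data, else 1024 (the behavior at
    # exactly 100 hours is an assumption here).
    return 2048 if hours > 100 else 1024

print(hidden_layer_count(True), first_hidden_nodes(500))   # 6 2048
print(hidden_layer_count(False), first_hidden_nodes(50))   # 3 1024
```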
Step 104: determine the node counts of the subsequent hidden layers according to the data amount of the training data, the number of hidden layers, and the node count of the first hidden layer, such that the node counts of the hidden layers decrease.
Specifically, a decrement mode and a decrement ratio can be determined according to the data amount of the training data, and the node counts of the subsequent hidden layers are then determined from the decrement mode and decrement ratio.
Step 105: determine the model parameters of the deep neural network from the training data to obtain the deep neural network.
Specifically, the model parameters here are the weights of the deep neural network. The method of determining them from the training data is the same as in the prior art and is not repeated here.
The determination of the node counts of the subsequent hidden layers in step 104, with the node counts of the hidden layers decreasing, is described in detail below.

Fig. 2 shows one flowchart of a method of determining the node counts of the subsequent hidden layers in an embodiment of the present invention.

In general, the larger the data amount of the training data, the more hidden-layer nodes need to be retained; conversely, fewer hidden-layer nodes are needed. On this basis, in this embodiment the node counts of the hidden layers are made to decrease when determining the node counts of the subsequent hidden layers, specifically through the following steps:
Step 201: determine the decrement ratio p from the data amount H of the training data, where 1 ≤ p ≤ 2. When the data amount is small, fewer hidden-layer nodes are needed, so p is close to 2; as the data amount grows toward 10000 hours, p drops gradually from 2 to 1; and once the data amount exceeds 10000 hours, p stays at 1, i.e. every hidden layer has the same node count, and the network is then a traditional standard feed-forward deep neural network with identical hidden-layer node counts.

Step 202: determine the node count N_k of each subsequent hidden layer as

N_k = N_1 / p^(k-1)

so that, starting from the node count N_1 of the first hidden layer, the node counts of the subsequent hidden layers decrease successively by the decrement ratio p.
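Step 202 can be sketched as follows; rounding to an integer is an added assumption (the examples in this description use powers of two, so no rounding is actually needed there):

```python
def uniform_decrement(n1, num_layers, p):
    # Fig. 2 scheme: N_k = N_1 / p**(k-1) for k = 1..num_layers.
    return [round(n1 / p ** k) for k in range(num_layers)]

print(uniform_decrement(1024, 3, 2))  # [1024, 512, 256]
print(uniform_decrement(2048, 6, 1))  # six identical 2048-node layers
```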
Fig. 3 shows another flowchart of a method of determining the node counts of the subsequent hidden layers.

The method in this embodiment differentiates by the attributes and the data amount of the training data: for discrete training data or a small data amount, fewer hidden-layer nodes are required, and the node counts can shrink rapidly without hurting recognition performance; conversely, for continuous training data or a large data amount, more hidden-layer nodes are required, and the node counts must not shrink too fast if recognition performance is to be preserved. Since the number of hidden layers is determined by the attributes of the training data, the method in this embodiment effectively differentiates by the number of hidden layers and the data amount.

In this embodiment, when the node counts of the subsequent hidden layers are determined: if the number of hidden layers is less than or equal to a layer-count threshold, the node count of each hidden layer decreases successively by the decrement ratio, starting from the node count of the first hidden layer; if the number of hidden layers is greater than the threshold, the node count of each odd-numbered hidden layer decreases by the decrement ratio relative to the preceding odd-numbered hidden layer, and the node count of each even-numbered hidden layer equals that of the hidden layer immediately before it. The method specifically comprises the following steps:
Step 301: determine the decrement ratio p from the data amount H of the training data, where 1 ≤ p ≤ 2, in the same way as in step 201: p is close to 2 when the data amount is small, drops gradually from 2 to 1 as the data amount grows toward 10000 hours, and stays at 1 beyond 10000 hours, in which case every hidden layer has the same node count and the network is a traditional standard feed-forward deep neural network.

Step 302: judge whether the number of hidden layers L is less than or equal to the layer-count threshold L_0.

Step 303: if the number of hidden layers L is less than or equal to L_0, determine the node count of each subsequent hidden layer as

N_k = N_1 / p^(k-1)

so that, starting from the node count N_1 of the first hidden layer, the node counts of the subsequent hidden layers decrease successively by the decrement ratio p.

Step 304: if the number of hidden layers L is greater than L_0, determine the node counts of the subsequent hidden layers as

N_(2m-1) = N_1 / p^(m-1), N_(2m) = N_(2m-1)

so that the node count N_(2m-1) of each odd-numbered hidden layer decreases by the decrement ratio p relative to the preceding odd-numbered hidden layer, while the node count N_(2m) of each even-numbered hidden layer equals that of the hidden layer immediately before it.
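The odd/even scheme of step 304 can be sketched as follows (rounding is again an added assumption):

```python
def paired_decrement(n1, num_layers, p):
    # Fig. 3 scheme for L > L0: odd layers follow N_(2m-1) = N_1 / p**(m-1)
    # and each even layer repeats the layer immediately before it.
    sizes = []
    for k in range(1, num_layers + 1):
        m = (k + 1) // 2  # index of the governing odd-numbered layer
        sizes.append(round(n1 / p ** (m - 1)))
    return sizes

print(paired_decrement(1024, 6, 2))  # [1024, 1024, 512, 512, 256, 256]
```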
The two methods of determining the node counts of the subsequent hidden layers are illustrated below. For large-vocabulary continuous training data with a data amount above 10000 hours, the scheme above gives L = 6 hidden layers, N_1 = 2048 nodes in the first hidden layer, and decrement ratio p = 1; the six hidden layers of the corresponding deep neural network thus have 2048, 2048, 2048, 2048, 2048, 2048 nodes. For large-vocabulary continuous training data with a data amount below 100 hours, the scheme above gives L = 6, N_1 = 1024, and p = 2; since L exceeds the preset layer-count threshold L_0 = 3, the method of Fig. 3 gives the six hidden layers 1024, 1024, 512, 512, 256, 256 nodes. For discrete training data with a data amount below 100 hours, the scheme above gives L = 3, N_1 = 1024, and p = 2; since L equals the preset threshold L_0 = 3, the method of Fig. 3 gives the three hidden layers 1024, 512, 256 nodes. Compared with the currently common neural networks with 2048 nodes in every hidden layer, the present invention greatly reduces the parameter count of the neural network without losing the recognition performance of the deep neural network applied to speech recognition. It is worth mentioning that setting the first hidden layer to N_1 = 3072 nodes can even improve final recognition performance while keeping the total parameter scale of the network the same.
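The claimed parameter savings can be checked with a quick weight count. This comparison pairs the conventional six-by-2048 topology with the 100-hour worked example above, so it is only indicative; input dimension 473 and 3000 output states are the figures used earlier in the description, and biases are omitted:

```python
def param_count(sizes):
    # Weight count of a fully connected net (biases omitted).
    return sum(a * b for a, b in zip(sizes, sizes[1:]))

uniform = [473] + [2048] * 6 + [3000]
patent = [473] + [1024, 1024, 512, 512, 256, 256] + [3000]
print(param_count(uniform) / param_count(patent))  # ~8.6x fewer weights
```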
In summary, compared with the currently common deep neural networks whose hidden layers all have the same node count, a deep neural network built with the present invention has far fewer parameters, requiring less storage space and speeding up model training. In particular, in the tied-state DNN-HMM systems currently used by large-vocabulary speech recognition systems, the output layer can reach 10,000 nodes or more, so reducing the node count of the last hidden layer effectively reduces the parameter count of the network. In addition, because fewer parameters reduce the time spent computing state output probabilities during decoding, applying a network built with the present invention to a speech recognition system improves final decoding speed and therefore real-time performance in practice. Furthermore, the method of the present invention does not affect final recognition performance even though the parameter count is substantially reduced, and with the parameter count held constant, recognition performance can even be improved by increasing the node counts of the hidden layers near the input layer.

Correspondingly, an embodiment of the present invention also provides a system for constructing a deep neural network; Fig. 4 shows a schematic diagram of its structure.
In this embodiment, the system for constructing a deep neural network comprises:
an input/output layer determining unit 401, configured to determine the number of nodes in the input layer and the number of nodes in the output layer of the deep neural network;
a data acquisition unit 402, configured to acquire training data;
a first hidden-layer determining unit 403, configured to determine the number of hidden layers of the deep neural network and the number of nodes in the first hidden layer;
a second hidden-layer determining unit 404, configured to determine the node counts of the subsequent hidden layers according to the data amount of the training data, the number of hidden layers, and the node count of the first hidden layer, such that the node counts of the hidden layers decrease;
a model-parameter determining unit 405, configured to determine the model parameters of the deep neural network from the training data to obtain the deep neural network.
In an embodiment of the present invention, a specific structure of the second hidden-layer determining unit may comprise a decrement-mode determining unit and a node-count determining unit, wherein:
the decrement-mode determining unit is configured to determine a decrement mode and a decrement ratio according to the data amount of the training data;
the node-count determining unit is configured to determine the node counts of the subsequent hidden layers according to the determined decrement mode and decrement ratio.
In practical applications, the decrement mode determining unit may determine the decrement mode and decrement ratio of the hidden layer nodes in various ways. For example, the node count of each subsequent hidden layer may decrease layer by layer according to the decrement ratio, starting from the node count of the first hidden layer. As another example, for each hidden layer whose layer index is less than or equal to a layer-count threshold, the node count decreases layer by layer according to the decrement ratio, starting from the node count of the first hidden layer; for each hidden layer whose layer index is greater than the threshold, the node count of each odd-indexed hidden layer decreases according to the decrement ratio on the basis of the node count of the preceding odd-indexed hidden layer, while each even-indexed hidden layer keeps the same node count as the hidden layer immediately preceding it.
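As an illustrative sketch only, the two decrement modes described above might be implemented as follows. This is not the patent's implementation; the function names and the parameters `first_nodes`, `p` (so the decrement ratio is 1/p), and `threshold` are hypothetical, and rounding to whole node counts is an added assumption.

```python
def uniform_taper(first_nodes, num_layers, p):
    """Mode 1: every subsequent hidden layer shrinks relative to the first,
    so the k-th hidden layer has roughly first_nodes / p**(k-1) nodes."""
    sizes = [first_nodes]
    for k in range(2, num_layers + 1):
        sizes.append(max(1, round(first_nodes / p ** (k - 1))))
    return sizes

def paired_taper(first_nodes, num_layers, p, threshold):
    """Mode 2: layers up to `threshold` taper layer by layer; beyond the
    threshold, only odd-indexed layers shrink (relative to the preceding
    odd-indexed layer), and each even-indexed layer repeats the node count
    of the layer immediately before it."""
    sizes = []
    for k in range(1, num_layers + 1):
        if k <= threshold:
            # same layer-by-layer taper as mode 1
            sizes.append(max(1, round(first_nodes / p ** (k - 1))))
        elif k % 2 == 1:
            # odd-indexed layer past the threshold: previous odd layer is k-2
            sizes.append(max(1, round(sizes[k - 3] / p)))
        else:
            # even-indexed layer copies the layer immediately before it
            sizes.append(sizes[k - 2])
    return sizes
```

For example, with `p = 2` the first mode halves each successive layer, while the second mode produces pairs of equal-width layers past the threshold, which is how the even-indexed layers keep the node count of their predecessors.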
Correspondingly, the node count determining unit determines the number of nodes in each subsequent hidden layer according to the decrement mode and decrement ratio determined by the decrement mode determining unit, thereby greatly reducing the number of network parameters, which in turn reduces the required storage space and speeds up model training.
Compared with deep neural networks in common use today, in which every hidden layer has the same number of nodes, a deep neural network built with the system of this embodiment greatly reduces the number of network parameters, thereby reducing the required storage space and speeding up model training. This is especially true for the tied-state DNN-HMM systems used in current large-vocabulary speech recognition, where the output layer may contain 10,000 nodes or more, so reducing the node count of the last hidden layer effectively reduces the number of network parameters. In addition, because fewer network parameters shorten the time needed to compute state output probabilities during decoding, applying a deep neural network built according to the present invention to a speech recognition system improves the decoding speed of the final recognition and thus provides better real-time performance in practice. Furthermore, even when the number of network parameters is greatly reduced, the method of the present invention does not degrade the final recognition performance of the speech recognition system; with the number of parameters held constant, recognition performance can even be improved by increasing the node counts of the hidden layers near the input layer.
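The parameter savings claimed above can be illustrated by counting the weights of a fully connected network. The sketch below uses hypothetical layer sizes (a 429-node input layer, 2048-node hidden layers, and a 10,000-node tied-state output layer); these values are illustrative examples, not figures from the patent, and bias parameters are ignored for simplicity.

```python
def weight_count(layer_sizes):
    """Total number of weight parameters in a fully connected network:
    the sum over consecutive layer pairs of (fan_in * fan_out)."""
    return sum(a * b for a, b in zip(layer_sizes, layer_sizes[1:]))

input_nodes, output_nodes = 429, 10000  # hypothetical DNN-HMM dimensions

# Uniform-width network: five 2048-node hidden layers.
uniform = [input_nodes] + [2048] * 5 + [output_nodes]

# Tapered network: hidden layers shrink toward the output layer.
tapered = [input_nodes, 2048, 1024, 512, 512, 256, output_nodes]

# The bulk of the saving comes from the small last hidden layer feeding
# the very wide output layer, as the description notes.
print(weight_count(uniform), weight_count(tapered))
```

Because the output layer is so wide, shrinking the last hidden layer from 2048 to 256 nodes alone removes roughly 18 million weights in this hypothetical configuration, which is why the last hidden layer dominates the parameter budget.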
The embodiments in this specification are described in a progressive manner; for identical or similar parts among the embodiments, reference may be made between them, and each embodiment focuses on its differences from the others. In particular, since the system embodiment is substantially similar to the method embodiment, its description is relatively brief, and the relevant parts may refer to the description of the method embodiment. The system embodiment described above is merely illustrative; the units and modules described as separate components may or may not be physically separate. Some or all of the units and modules may be selected according to actual needs to achieve the objectives of the present embodiment. Those of ordinary skill in the art can understand and implement the embodiments without creative effort.
The structure, features, and effects of the present invention have been described in detail above with reference to the embodiments shown in the drawings. The foregoing are merely preferred embodiments of the present invention, and the practical scope of the invention is not limited to what is shown in the drawings. Any change made according to the concept of the present invention, or any equivalent embodiment with equivalent variations, that does not go beyond the spirit covered by the specification and drawings shall fall within the protection scope of the present invention.
Claims (10)
1. A method for constructing a deep neural network, characterized by comprising:
determining the number of nodes in the input layer and the number of nodes in the output layer of the deep neural network;
acquiring training data;
determining the number of hidden layers of the deep neural network and the number of nodes in the first hidden layer;
determining the number of nodes in each subsequent hidden layer according to the amount of the training data, the number of hidden layers, and the number of nodes in the first hidden layer, such that the node counts of the hidden layers decrease progressively; and
determining the model parameters of the deep neural network using the training data to obtain the deep neural network.
2. The method for constructing a deep neural network according to claim 1, characterized in that determining the number of nodes in each subsequent hidden layer such that the node counts of the hidden layers decrease progressively comprises:
determining a decrement mode and a decrement ratio according to the amount of the training data; and
determining the number of nodes in each subsequent hidden layer according to the determined decrement mode and decrement ratio.
3. The method for constructing a deep neural network according to claim 2, characterized in that determining the decrement mode and the decrement ratio according to the amount of the training data comprises:
decreasing the node count of each subsequent hidden layer layer by layer according to the decrement ratio, starting from the node count of the first hidden layer.
4. The method for constructing a deep neural network according to claim 2, characterized in that determining the decrement mode and the decrement ratio according to the amount of the training data comprises:
for each hidden layer whose layer index is less than or equal to a layer-count threshold, decreasing its node count layer by layer according to the decrement ratio, starting from the node count of the first hidden layer; and
for each hidden layer whose layer index is greater than the layer-count threshold, decreasing the node count of each odd-indexed hidden layer according to the decrement ratio on the basis of the node count of the preceding odd-indexed hidden layer, and setting the node count of each even-indexed hidden layer equal to the node count of the hidden layer immediately preceding it.
5. The method for constructing a deep neural network according to claim 2, 3, or 4, characterized in that the decrement ratio of the k-th hidden layer is 1/p^(k-1), where 1 ≤ p ≤ 2.
6. A system for constructing a deep neural network, characterized by comprising:
an input/output layer determining unit, configured to determine the number of nodes in the input layer and the number of nodes in the output layer of the deep neural network;
a data acquisition unit, configured to acquire training data;
a first hidden layer determining unit, configured to determine the number of hidden layers of the deep neural network and the number of nodes in the first hidden layer;
a second hidden layer determining unit, configured to determine the number of nodes in each subsequent hidden layer according to the amount of the training data, the number of hidden layers, and the number of nodes in the first hidden layer, such that the node counts of the hidden layers decrease progressively; and
a model parameter determining unit, configured to determine the model parameters of the deep neural network using the training data to obtain the deep neural network.
7. The system for constructing a deep neural network according to claim 6, characterized in that the second hidden layer determining unit comprises:
a decrement mode determining unit, configured to determine a decrement mode and a decrement ratio according to the amount of the training data; and
a node count determining unit, configured to determine the number of nodes in each subsequent hidden layer according to the determined decrement mode and decrement ratio.
8. The system for constructing a deep neural network according to claim 7, characterized in that the decrement mode determining unit is specifically configured to determine, according to the amount of the training data, that the node count of each subsequent hidden layer decreases layer by layer according to the decrement ratio, starting from the node count of the first hidden layer.
9. The system for constructing a deep neural network according to claim 7, characterized in that the decrement mode determining unit is specifically configured to: for each hidden layer whose layer index is less than or equal to a layer-count threshold, determine that its node count decreases layer by layer according to the decrement ratio, starting from the node count of the first hidden layer; and, for each hidden layer whose layer index is greater than the layer-count threshold, determine that the node count of each odd-indexed hidden layer decreases according to the decrement ratio on the basis of the node count of the preceding odd-indexed hidden layer, while the node count of each even-indexed hidden layer equals the node count of the hidden layer immediately preceding it.
10. The system for constructing a deep neural network according to claim 7, 8, or 9, characterized in that the decrement ratio of the k-th hidden layer is 1/p^(k-1), where 1 ≤ p ≤ 2.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201310755400.9A CN104751227B (en) | 2013-12-31 | 2013-12-31 | Construction method and system for the deep neural network of speech recognition |
Publications (2)
Publication Number | Publication Date |
---|---|
CN104751227A true CN104751227A (en) | 2015-07-01 |
CN104751227B CN104751227B (en) | 2018-03-06 |
Family
ID=53590872
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201310755400.9A Active CN104751227B (en) | 2013-12-31 | 2013-12-31 | Construction method and system for the deep neural network of speech recognition |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN104751227B (en) |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20040076944A1 (en) * | 2002-08-22 | 2004-04-22 | Ibex Process Technology, Inc. | Supervised learning in the presence of null data |
US7756646B2 (en) * | 2006-03-31 | 2010-07-13 | Battelle Memorial Institute | Method for predicting peptide detection in mass spectrometry |
CN102411931A (en) * | 2010-09-15 | 2012-04-11 | 微软公司 | Deep belief network for large vocabulary continuous speech recognition |
CN102496059A (en) * | 2011-11-25 | 2012-06-13 | 中冶集团武汉勘察研究院有限公司 | Mine shaft well engineering surrounding rock artificial intelligence stage division method |
CN103049792A (en) * | 2011-11-26 | 2013-04-17 | 微软公司 | Discriminative pretraining of Deep Neural Network |
CN103400577A (en) * | 2013-08-01 | 2013-11-20 | 百度在线网络技术(北京)有限公司 | Acoustic model building method and device for multi-language voice identification |
Non-Patent Citations (2)
Title |
---|
刘维群等: "BP网络中隐含层节点优化的研究", 《交通与计算机》 * |
高大文等: "人工神经网络中隐含层节点与训练次数的优化", 《哈尔滨工业大学学报》 * |
Cited By (22)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2017076211A1 (en) * | 2015-11-05 | 2017-05-11 | 阿里巴巴集团控股有限公司 | Voice-based role separation method and device |
CN109154798B (en) * | 2016-05-09 | 2022-02-25 | 1Qb信息技术公司 | Method and system for improving strategies for stochastic control problems |
CN109154798A (en) * | 2016-05-09 | 2019-01-04 | 1Qb信息技术公司 | For improving the method and system of the strategy of Stochastic Control Problem |
WO2017206936A1 (en) * | 2016-06-02 | 2017-12-07 | 腾讯科技(深圳)有限公司 | Machine learning based network model construction method and apparatus |
CN106096727A (en) * | 2016-06-02 | 2016-11-09 | 腾讯科技(深圳)有限公司 | A kind of network model based on machine learning building method and device |
JP2018533153A (en) * | 2016-06-02 | 2018-11-08 | テンセント・テクノロジー・(シェンジェン)・カンパニー・リミテッド | Network model construction method and apparatus based on machine learning |
CN106096727B (en) * | 2016-06-02 | 2018-12-07 | 腾讯科技(深圳)有限公司 | A kind of network model building method and device based on machine learning |
US11741361B2 (en) | 2016-06-02 | 2023-08-29 | Tencent Technology (Shenzhen) Company Limited | Machine learning-based network model building method and apparatus |
CN108122035A (en) * | 2016-11-29 | 2018-06-05 | 科大讯飞股份有限公司 | End-to-end modeling method and system |
CN108122035B (en) * | 2016-11-29 | 2019-10-18 | 科大讯飞股份有限公司 | End-to-end modeling method and system |
CN106898354A (en) * | 2017-03-03 | 2017-06-27 | 清华大学 | Speaker number estimation method based on DNN models and supporting vector machine model |
CN106898354B (en) * | 2017-03-03 | 2020-05-19 | 北京华控智加科技有限公司 | Method for estimating number of speakers based on DNN model and support vector machine model |
CN109697506A (en) * | 2017-10-20 | 2019-04-30 | 图核有限公司 | Processing in neural network |
CN108648769A (en) * | 2018-04-20 | 2018-10-12 | 百度在线网络技术(北京)有限公司 | Voice activity detection method, apparatus and equipment |
CN109034372B (en) * | 2018-06-28 | 2020-10-16 | 浙江大学 | Neural network pruning method based on probability |
CN109034372A (en) * | 2018-06-28 | 2018-12-18 | 浙江大学 | A kind of neural networks pruning method based on probability |
CN109102067B (en) * | 2018-07-13 | 2021-08-06 | 厦门快商通信息技术有限公司 | Method for increasing and decreasing neural network nodes, computer device and storage medium |
CN109102067A (en) * | 2018-07-13 | 2018-12-28 | 厦门快商通信息技术有限公司 | The method of increase and decrease certainly, computer equipment and the storage medium of neural network node |
CN109255432A (en) * | 2018-08-22 | 2019-01-22 | 中国平安人寿保险股份有限公司 | Neural network model construction method and device, storage medium, electronic equipment |
CN109255432B (en) * | 2018-08-22 | 2024-04-30 | 中国平安人寿保险股份有限公司 | Neural network model construction method and device, storage medium and electronic equipment |
CN111883172A (en) * | 2020-03-20 | 2020-11-03 | 珠海市杰理科技股份有限公司 | Neural network training method, device and system for audio packet loss repair |
CN111883172B (en) * | 2020-03-20 | 2023-11-28 | 珠海市杰理科技股份有限公司 | Neural network training method, device and system for audio packet loss repair |
Also Published As
Publication number | Publication date |
---|---|
CN104751227B (en) | 2018-03-06 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN104751227A (en) | Method and system for constructing deep neural network | |
CN104751228A (en) | Method and system for constructing deep neural network | |
CN104143327B (en) | A kind of acoustic training model method and apparatus | |
CN107545903B (en) | Voice conversion method based on deep learning | |
CN103400577B (en) | The acoustic model method for building up of multilingual speech recognition and device | |
CN104538028B (en) | A kind of continuous speech recognition method that Recognition with Recurrent Neural Network is remembered based on depth shot and long term | |
CN109119072A (en) | Civil aviaton's land sky call acoustic model construction method based on DNN-HMM | |
CN107785015A (en) | A kind of audio recognition method and device | |
CN107767861B (en) | Voice awakening method and system and intelligent terminal | |
CN105427869A (en) | Session emotion autoanalysis method based on depth learning | |
CN104036774A (en) | Method and system for recognizing Tibetan dialects | |
CN104751842A (en) | Method and system for optimizing deep neural network | |
CN106611597A (en) | Voice wakeup method and voice wakeup device based on artificial intelligence | |
CN102779510B (en) | Speech emotion recognition method based on feature space self-adaptive projection | |
CN104700828A (en) | Deep long-term and short-term memory recurrent neural network acoustic model establishing method based on selective attention principles | |
CN103117060A (en) | Modeling approach and modeling system of acoustic model used in speech recognition | |
CN107393542A (en) | A kind of birds species identification method based on binary channels neutral net | |
CN104952448A (en) | Method and system for enhancing features by aid of bidirectional long-term and short-term memory recurrent neural networks | |
CN105096941A (en) | Voice recognition method and device | |
CN109754790B (en) | Speech recognition system and method based on hybrid acoustic model | |
CN111862942B (en) | Method and system for training mixed speech recognition model of Mandarin and Sichuan | |
CN111179917B (en) | Speech recognition model training method, system, mobile terminal and storage medium | |
CN110070855A (en) | A kind of speech recognition system and method based on migration neural network acoustic model | |
CN109754789A (en) | The recognition methods of phoneme of speech sound and device | |
CN113674732B (en) | Voice confidence detection method and device, electronic equipment and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
CB02 | Change of applicant information |
Address after: Wangjiang Road high tech Development Zone Hefei city Anhui province 230088 No. 666 Applicant after: Iflytek Co., Ltd. Address before: Wangjiang Road high tech Development Zone Hefei city Anhui province 230088 No. 666 Applicant before: Anhui USTC iFLYTEK Co., Ltd. |
|
COR | Change of bibliographic data | ||
GR01 | Patent grant | ||
GR01 | Patent grant |